Asynchronously and recursively crawl links in JavaScript

I am building a blog using Notion as the content management system. notion-api-js provides an unofficial API with a function getPagesByIndexId(pageId) that returns a page's content, its subpages' contents, and its parent's contents. The result is an array of objects that looks like:
[
  { moreStuff: ..., Attributes: { slug: "home page slug", id: "home page id", moreStuff: ... } },
  { moreStuff: ..., Attributes: { slug: "parent to homepage", id: "homepage's parent id", moreStuff: ... } },
  { moreStuff: ..., Attributes: { slug: "sub page slug 0", id: "sub page id 0", moreStuff: ... } },
  { moreStuff: ..., Attributes: { slug: "sub page slug 1", id: "sub page id 1", moreStuff: ... } },
];
I want to build a tree by recursing from a given id through the ids that getPagesByIndexId(id) returns, extracting every slug and id. The recursion should stop when getPagesByIndexId(id) returns only objects whose ids have already been crawled.
I use a crawledIdsList array to keep track of the ids already crawled, fetchPage is a thin wrapper around getPagesByIndexId, and I use flatMap to drop the empty []s returned from the callback. To run this locally on Node, the only dependency is notion-api-js (npm i notion-api-js). Thanks in advance!
The tree structure of the page whose id I provided (the "Dev" page, stored in homePageId) looks like:
My current code follows. It reaches the "end" case and returns successfully, but many pages appear in the result more than once.
const Notion = require("notion-api-js").default;
const token_v2 = "543f8f8529f361ab34596f5be9bc972b96ab8d8dc9e6e41546c05751b51a18a6c7d40b689d80794babae3a91aeb5dd5e47c34edb724cc356ceceacf3a8061158bfab92e68b7614516a0699295990"
const notion = new Notion({
token: token_v2,
});
const fetchPage = (id) => {
return notion.getPagesByIndexId(id);
};
const homePageId = "3be663ea-90ce-4c45-b04e-41161b992dda"
var crawledIdsList = [];
buildTree({}, homePageId).then((tree) => { console.log(tree); });
function buildTree(tree, id) {
return fetchPage(id).then((pages) => {
tree.subpages = [];
tree.slug = pages[0].Attributes.slug;
tree.id = id;
crawledIdsList.push(id);
return Promise.all(
pages.flatMap((page) => {
var currentCrawlId = page.Attributes.id;
if (crawledIdsList.indexOf(currentCrawlId) === -1) {
// executes code block if currentCrawlId is not used in fetchPage(id) yet
crawledIdsList.push(currentCrawlId);
return buildTree({}, currentCrawlId).then((futureData) => {
tree.subpages.push(futureData);
return tree;
});
} else {
if (crawledIdsList.indexOf(id) >= 0) {
return [];
}
return tree; // end case. futureData passed to earlier calls is tree, which looks like {subpages: [], slug: someSlug, id: someId}
}
})
)
});
}
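One possible way to avoid the duplicates is to mark an id as visited before fetching it, and to have each recursive call return only the node it built. In the code above, buildTree appears to resolve to the Promise.all array (one copy of the parent tree per page) instead of a single node, which would explain the repeated entries. The following is a minimal sketch under those assumptions. fakePages, page and fetchPageStub are hypothetical stand-ins for notion.getPagesByIndexId so that the example runs without a token, and siblings are crawled one after another to keep the output order predictable.

```javascript
// Hypothetical helper that builds a page object in the shape shown in the question.
function page(id) {
  return { Attributes: { id, slug: `${id}-slug` } };
}

// Hypothetical in-memory stand-in for notion.getPagesByIndexId(id):
// each entry lists the page itself first, then its parent and its subpages.
const fakePages = {
  home: [page("home"), page("root"), page("a"), page("b")],
  root: [page("root"), page("home")],
  a: [page("a"), page("home"), page("a1")],
  b: [page("b"), page("home")],
  a1: [page("a1"), page("a")],
};

const fetchPageStub = (id) => Promise.resolve(fakePages[id]);

async function buildTree(id, visited = new Set()) {
  // Mark the id before awaiting, so no other branch can pick it up again.
  visited.add(id);
  const pages = await fetchPageStub(id);
  const node = { id, slug: pages[0].Attributes.slug, subpages: [] };

  for (const p of pages) {
    const linkedId = p.Attributes.id;
    if (visited.has(linkedId)) continue; // already crawled: end case
    // Each call returns exactly one node, which is attached exactly once.
    node.subpages.push(await buildTree(linkedId, visited));
  }
  return node;
}

buildTree("home").then((tree) => console.log(JSON.stringify(tree, null, 2)));
```

As in the original code, the parent page returned by the API is treated like any other linked page, so it shows up under subpages of the starting page.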

Related

Trying to write a recursive asynchronous search in JavaScript

I am trying to write some code that searches through a set of objects in a MongoDB database. I want to pull the objects from the database by ID; those objects in turn hold ID references to other objects. The program should search for a specific ID by repeating this process: first getting an object from its ID, then getting the IDs that the object references.
async function objectFinder(ID1, ID2, depth, previousList = []) {
let route = []
if (ID1 == ID2) {
return [ID2]
} else {
previousList.push(ID1)
let obj1 = await findObjectByID(ID1)
let connectedID = obj1.connections.concat(obj1.inclusions) //creates array of both references to object and references from object
let mapPromises = connectedID.map(async (id) => {
return findID(id) //async function
})
let fulfilled = await Promise.allSettled(mapPromises)
let list = fulfilled.map((object) => {
return object.value.main, object.value.included
})
list = list.filter(id => !previousList.includes(id))
for (id of list) {
await objectFinder(id, ID2, depth - 1, previousList).then(result => {
route = [ID1].concat(result)
if (route[route.length - 1] == ID2) {
return route
}})
}
}
if (route[route.length - 1] == ID2) {
return route
}
}
I am not sure how to make it so that my code works like a tree search, with each object and ID being a node.
I didn't look too closely at your code, as I strongly believe in letting your database do the work for you where possible.
In this case Mongo has the $graphLookup aggregation stage, which allows recursive lookups. Here is a quick example of how to use it:
db.collection.aggregate([
{
$match: {
_id: 1,
}
},
{
"$graphLookup": {
"from": "collection",
"startWith": "$inclusions",
"connectFromField": "inclusions",
"connectToField": "_id",
"as": "matches",
}
},
{
//the rest of the pipeline is just to restore the original structure you don't need this
$addFields: {
matches: {
"$concatArrays": [
[
{
_id: "$_id",
inclusions: "$inclusions"
}
],
"$matches"
]
}
}
},
{
$unwind: "$matches"
},
{
"$replaceRoot": {
"newRoot": "$matches"
}
}
])
If for whatever reason you want to keep this in code then I would take a look at your for loop:
for (id of list) {
await objectFinder(id, ID2, depth - 1, previousList).then(result => {
route = [ID1].concat(result);
if (route[route.length - 1] == ID2) {
return route;
}
});
}
Just from a quick glance, I can tell that you are executing this:
route = [ID1].concat(result);
many times at the same level, so each iteration overwrites the result of the previous one. Additionally, I could not understand your bottom return statements; I suspect there is an issue there.
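If you do keep the search in code, one way to structure it is a depth-limited depth-first search that returns as soon as one branch finds the target, and returns null otherwise. The following is only a sketch: graph and findNeighbours are hypothetical stand-ins for the database lookups (findObjectByID / findID), and the function returns the first route it finds, not necessarily the shortest one.

```javascript
// Hypothetical in-memory graph standing in for the MongoDB collection.
const graph = {
  A: ["B", "C"],
  B: ["D"],
  C: ["E"],
  D: [],
  E: ["F"],
  F: [],
};

// Hypothetical stand-in for the async database lookup of connected ids.
const findNeighbours = async (id) => graph[id] ?? [];

async function findRoute(from, to, depth, visited = new Set()) {
  if (from === to) return [to];
  if (depth === 0) return null; // depth budget used up on this branch
  visited.add(from);

  for (const next of await findNeighbours(from)) {
    if (visited.has(next)) continue;
    const rest = await findRoute(next, to, depth - 1, visited);
    // Return from the function itself (not from a .then callback),
    // so the loop stops at the first branch that reaches the target.
    if (rest) return [from, ...rest];
  }
  return null; // no branch below this node reached the target
}

findRoute("A", "F", 5).then((route) => console.log(route));
```

Because the visited set is shared across branches, a node that was rejected only because the depth ran out will not be retried through a shorter path; a breadth-first search would be a better fit if the shortest route matters.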

Javascript - filtering a list: how can I find an intersection between an array with objects including array and an array?

How can I filter a list (an array of objects) with a filter list (an array) and find the intersections? I add a value to the filter array every time the user checks a filter's checkbox, and remove it when the user unchecks the checkbox. Whatever I try, I always get back the entire reviews array, including all the items that should have been filtered out. Why? Thanks!
const reviews = [
{
title: "item 1",
filter_results: {
features: ["message", "call"],
pricing: ["Business", "Free", "Whatever"],
rating: [1]
}
},
{
title: "item 2",
filter_results: {
features: ["call", "copy", "paste"],
pricing: ["Business"],
rating: [1]
}
},
{
title: "item 3",
filter_results: {
features: ["copy", "connect", "wifi"],
pricing: ["Free"],
rating: [2]
}
}
]
const filteredReviews = {
pricing_options: ["Business"],
popular_features: ["copy, call"],
rating: [1, 2]
}
const update = (reviews, categoryName) => {
if (categoryName) {
return reviews.filter(review => {
return review.filter_results[categoryName].filter(value => {
if (filteredReviews[categoryName].includes(value)) {
return review
}
})
})
} else {
return reviews
}
}
update(reviews, "pricing")
Return a boolean from the filter callback, and use a more general filtering mechanism:
const update = (reviews, filters) => {
if (filters) {
return reviews.filter(review =>
Object.entries(filters)
// Change to `some` if the filters are OR'ed instead of AND'ed
.every(
([filter_key, filter_values]) =>
// Change `some` to `every` if all elements in the
// userFilter[*] array MUST be matched instead of some of them
filter_values.some( (filter_value) =>
review.filter_results[filter_key]
.includes(filter_value)
)
)
)
} else {
return reviews
}
}
// Fix variable names:
// - `userFilters` contains the filters selected by the user
// - `filteredReviews` contains the array of reviews that results from
// filtering the reviews using `userFilters`
// Fix key names: use the same keys as in the reviews:
// - `pricing_options` => `pricing`
// - `popular_features` => `features`
const userFilters = {
pricing: ["Business"],
// Transformed/fixed to 2 values. Was it a typo?
features: ["copy", "call"],
};
const filteredReviews = update(reviews, userFilters);
The filter callback should return a boolean. You are returning arrays, which always evaluate to true, even when they are empty.
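The difference is easy to see in isolation. Here is a small sketch with hypothetical sample data (sampleReviews and wanted are made up for the illustration), comparing a callback that returns an array with one that returns a boolean:

```javascript
// Hypothetical, simplified sample data for illustration only.
const sampleReviews = [
  { title: "item 1", pricing: ["Business", "Free"] },
  { title: "item 2", pricing: ["Free"] },
];
const wanted = ["Business"];

// The inner filter returns an array; even an empty array is truthy,
// so the outer filter keeps every review.
const broken = sampleReviews.filter((review) =>
  review.pricing.filter((value) => wanted.includes(value))
);

// `some` returns a real boolean, so only matching reviews are kept.
const fixed = sampleReviews.filter((review) =>
  review.pricing.some((value) => wanted.includes(value))
);

console.log(broken.length); // 2
console.log(fixed.map((review) => review.title)); // [ 'item 1' ]
```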

Find matching item across multiple store arrays in VueX

Currently, when I want to find a single item in an array that is in the store, I use this:
this.matched = this.$store.state.itemlist.find(itemId=> {
return itemId.id == "someid";
});
Let's say I want to go over multiple arrays to find the item that matches a given ID. For example, I have itemlist1, itemlist2, itemgetter()... Some of the arrays are getters (but I think that doesn't change much). So basically I want to search over several state and getter items in this component, instead of searching over just one as in the example above.
If you just want to find out whether the value exists in one of the arrays, you can simply write a function like this:
function find(search,...arrs){
return arrs.flat(1).find(item => item == search)
}
This function merges all the arrays into one long array and searches in it.
Example of usage:
let a=[1,2,3,4]
let b=[5,6,7,8]
let c=[9,10,11,12]
let i=find(6,a,b)
console.log(i)
Another option is to use one object to group all the arrays, so that it is possible to iterate over them. The idea is something like this:
const store = new Vuex.Store({
state: {
itemsGroupArrays: {
items1: [{ id: 1, text: "item1 - 1" }, { id: 2, text: "item1 - 2" }],
items2: [{ id: 3, text: "item2 - 1" }, { id: 4, text: "item2 - 2" }]
}
},
getters: {
getItemByIdFromStateGroupArrays: state => (id) => {
let returnedItem = null;
Object.values(state.itemsGroupArrays).forEach((itemStateArray) => {
if (itemStateArray.some(item => item.id === id)) {
returnedItem = itemStateArray.find(item => item.id === id);
}
})
return returnedItem;
}
}
});
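For the case in the question, where the items are objects matched by id and the lists come from different places, a plain helper may be enough. This is a minimal sketch outside of Vuex: itemlist1, itemlist2 and itemgetter here are hypothetical sample values standing in for the state arrays and the getter result, and findById is a made-up name.

```javascript
// Hypothetical sample lists standing in for state arrays and a getter result.
const itemlist1 = [{ id: "a1", text: "first" }];
const itemlist2 = [
  { id: "b7", text: "second" },
  { id: "b8", text: "third" },
];
const itemgetter = () => [{ id: "c3", text: "from getter" }];

// Checks each list in turn and stops at the first item whose id matches.
function findById(id, ...lists) {
  for (const list of lists) {
    const hit = list.find((item) => item.id === id);
    if (hit) return hit;
  }
  return undefined; // not present in any of the lists
}

console.log(findById("b8", itemlist1, itemlist2, itemgetter()));
```

Inside a component, the same call could be given this.$store.state.itemlist1, a getter value, and so on, since all it needs is arrays.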

Go through an object when the keys are known

I have an object:
var routes = {
"home":{
hash: "/home",
children: {
"just-home": {
hash: "/home/just-home",
children: {...}
},
"sub-homea": {
hash: "/home/sub-homea",
children: {...}
}
}
},
"contact":{
hash: "/contact",
children: {
"just-contact": {
hash: "/contact/just-contact",
children: {...}
},
"sub-contact": {
hash: "/contact/sub-contact",
children: {...}
}
}
}
};
How can I get to the object at just-contact.children when I know, for example, that the first key is contact and the next one is just-contact? I need to look this object up dynamically because the known keys will be different every time, so I need some kind of loop. Something like this:
const pathArray = ["contact", "just-contact"]
Object.keys(routes).map(function (item) {
if (routes[item] === pathArray[counter]){
ob = routes[item];
counter++;
}
})
but this only loops over the top level and does not go any deeper.
UPDATE, for a clearer explanation:
I read the values (contact, just-contact) from the location path (localhost:3000/contact/just-contact) and save them to an array (pathArray = ["contact", "just-contact"]). When the location path changes, the keys in the array change too. I need to find the children of the last key, in this example the children of the just-contact key.
I found a simple solution:
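// Assumption: obj starts at the root of the routes object.
let obj = routes;
pathArray.forEach(function (item) {
  // Descend only when the key exists at the current level.
  // (Comparing hash to item never matches: the hash is "/contact", the key is "contact".)
  if (obj && obj[item]) {
    obj = obj[item].children;
  }
});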
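The same walk can be written as a single reduce over the path. This is only a sketch: the routes object below is trimmed-down, hypothetical sample data (the "leaf" entry is invented to fill a children placeholder), and childrenAt is a made-up name. It assumes the entries of pathArray are exactly the keys used in the object.

```javascript
// Hypothetical, trimmed-down sample data in the same shape as `routes`.
const routes = {
  contact: {
    hash: "/contact",
    children: {
      "just-contact": {
        hash: "/contact/just-contact",
        children: {
          leaf: { hash: "/contact/just-contact/leaf", children: {} },
        },
      },
    },
  },
};

// Follows the keys one level at a time and returns the children of the
// last key, or undefined as soon as a key is missing.
function childrenAt(tree, pathArray) {
  return pathArray.reduce(
    (level, key) => (level && level[key] ? level[key].children : undefined),
    tree
  );
}

console.log(childrenAt(routes, ["contact", "just-contact"]));
```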

Recursion maintaining ancestors/parents nested object in JavaScript

I have a very deep nested category structure and I am given a category object that can exist at any depth. I need to be able to iterate through all category nodes until I find the requested category, plus be able to capture its parent categories all the way through.
Data Structure
[
{
CategoryName: 'Antiques'
},
{
CategoryName: 'Art',
children: [
{
CategoryName: 'Digital',
children: [
{
CategoryName: 'Nesting..'
}
]
},
{
CategoryName: 'Print'
}
]
},
{
CategoryName: 'Baby',
children: [
{
CategoryName: 'Toys'
},
{
CategoryName: 'Safety',
children: [
{
CategoryName: 'Gates'
}
]
}
]
},
{
CategoryName: 'Books'
}
]
Code currently in place
function findCategoryParent (categories, category, result) {
// Iterate through our categories...initially passes in the root categories
for (var i = 0; i < categories.length; i++) {
// Check if our current category is the one we are looking for
if(categories[i] != category){
if(!categories[i].children)
continue;
// We want to store each ancestor in this result array
var result = result || [];
result.push(categories[i]);
// Since we want to return data, we need to return our recursion
return findCategoryParent(categories[i].children, category, result);
}else{
// In case user clicks a parent category and it doesnt hit above logic
if(categories[i].CategoryLevel == 1)
result = [];
// Woohoo...we found it
result.push(categories[i]);
return result;
}
}
}
Problem
If I return the result of the recursive call, it works fine for 'Art' and all of its children. But because it returns, the category 'Baby' is never reached, so it can never find 'Gates', which lives at Baby/Safety/Gates.
If I do not return the result of the recursive call, it can only return root-level nodes.
Would appreciate any recommendations or suggestions.
Alright, I believe I found a solution that appears to work for me. I am not sure why it took my brain so long to figure it out, but the solution was, of course, a closure.
Essentially, I use a closure to keep the recursion scoped and to keep track, on each iteration, of the path it has traveled through so far.
var someobj = {
find: function (category, tree, path, callback) {
var self = this;
for (var i = tree.length - 1; i >= 0; i--) {
// Closure will allow us to scope our path variable and only what we have traversed
// in our initial and subsequent closure functions
(function(){
// copy but not reference
var currentPath = path.slice();
if(tree[i] == category){
currentPath.push({name: tree[i].name, id: tree[i].id});
var obj = {
index: i,
category: category,
parent: tree,
path: currentPath
};
callback(obj);
}else{
if(tree[i].children){
currentPath.push({name: tree[i].name, id: tree[i].id});
self.find(category, tree[i].children, currentPath, callback);
}
}
})(tree[i]);
}
},
/**
* gets called when user clicks a category to remove
* @param {[type]} category [description]
* @return {[type]} [description]
*/
removeCategory: function (category) {
// starts the quest for our category and its ancestors
// category is one we want to look for
// this.list is our root list of categories,
// pass in an initial empty array, each closure will add to its own instance
// callback to finish things off
this.find(category, this.list, [], function(data){
console.log(data);
});
}
}
Hope this helps others that need a way to traverse javascript objects and maintain parent ancestors.
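For comparison, the ancestor path can also be collected without a callback by returning from the loop only when a branch actually contains the target. This is a minimal sketch, not the approach above: categories is a shortened copy of the sample data from the question, findPath is a made-up name, and matching by CategoryName (rather than by object identity) is an assumption made to keep the example self-contained.

```javascript
// Shortened copy of the sample data from the question.
const categories = [
  { CategoryName: "Antiques" },
  { CategoryName: "Art", children: [{ CategoryName: "Print" }] },
  {
    CategoryName: "Baby",
    children: [
      { CategoryName: "Toys" },
      { CategoryName: "Safety", children: [{ CategoryName: "Gates" }] },
    ],
  },
];

// Returns the names from the root down to the target, or null if not found.
function findPath(nodes, name) {
  for (const node of nodes) {
    if (node.CategoryName === name) return [node.CategoryName];
    const below = node.children ? findPath(node.children, name) : null;
    // Only return on a hit; otherwise keep looping over the siblings.
    if (below) return [node.CategoryName, ...below];
  }
  return null;
}

console.log(findPath(categories, "Gates")); // [ 'Baby', 'Safety', 'Gates' ]
```

The key difference from the original attempt is that the recursive result is returned only when it is non-null, so a miss under 'Art' no longer ends the search before 'Baby' is visited.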
