I'm stuck. Stuck because Firestore is not (yet) capable of handling my (to me) relatively simple query. I don't want anything fancy. All I wish to get from the database are the meetings that have not ended yet. But I would like the list to be limited to 5 items and ordered by the starting date of the meeting, to display a small "Up next" style Agenda.
However, the following is not a valid query:
where("end", "<", now), orderBy("start", "asc"), limit(5));
So, how do I achieve this rather simple query for my Agenda?
And, while we're here, maybe we can dig into the other queries, too:
:: Display the last meeting (Already Over)
where("end", "<", now), orderBy("end", "desc"), limit(1))
:: Display the current meeting (Now) - Started, but not Ended.
.... ?????
:: Display the meetings which have not yet started (Next)
where("start", ">", now), orderBy("start", "asc"), limit(5))
The only thing that I can think of right now for the "current" meeting is to grab the array of all the meetings that have ended, grab the array of all the future meetings, and an array of all meetings. Subtract the (previous) and (future) arrays from the (all) list, and I'll have the one meeting that hasn't ended but has already started. There's got to be a more efficient way to do this. No?
The common approach for this is to define buckets, and then assign each meeting to the relevant buckets.
For example, say that you show the events that are going on today, and then style the events that have already started/ended differently.
In such a scenario, you could have an array of meetings_days that contains the days that the meeting is active: meeting_days: ['2022-12-01', '2022-12-02', '2022-12-03', '2022-12-04', '2022-12-05'].
Now you can use an array-contains filter to select the events for a given day, and then determine the styling in your client-side application code. If you don't show all events (e.g. you hide ones that have already finished), you'll need to determine the right bucket size to limit overreads while keeping the data size reasonable.
An alternative data model could be a map of the meeting days: meeting_days: { '2022-12-01': true, '2022-12-02': true, '2022-12-03': true, '2022-12-04': true, '2022-12-05': true }. Now you can do an AND-type query, like finding only events that run the first 2 days of December.
The correct data model here depends on the use-cases of your app, and will likely change/evolve as your app evolves.
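As a minimal sketch of the bucket approach with the modular Firebase v9 SDK (the db handle, the "meetings" collection name, and the field names are assumptions, not from the question): query one day's bucket with array-contains, then filter out already-ended meetings on the client:

import { collection, query, where, orderBy, limit, getDocs } from "firebase/firestore";

// Query one day's bucket; array-contains can be combined with an
// orderBy on a different field (a composite index may be required).
const today = "2022-12-02";
const q = query(
  collection(db, "meetings"),          // assumed collection name
  where("meeting_days", "array-contains", today),
  orderBy("start", "asc"),
  limit(25)                            // overread a little, filter below
);

const snapshot = await getDocs(q);     // inside an async function
// Client-side: keep only meetings that haven't ended yet ("Up next"),
// assuming `end` is stored as a Firestore Timestamp, then take the first 5.
const upNext = snapshot.docs
  .map((d) => ({ id: d.id, ...d.data() }))
  .filter((m) => m.end.toDate() > new Date())
  .slice(0, 5);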
Related
I'm currently using a platform to store orders that changes through time. I'm using Prometheus to also scrape when an order is created or when it changes its status. Find an example of the object below.
order = {id: <id>, status : <ORDERED, LOADED, DELIVERED>, time: <time>}
In order to save the order into Prometheus, I'm doing the following.
prom_order.inc(order)
I'm currently using a counter. However, after a while, the metrics API keeps track of very old records. Also, Prometheus will save orders with the same id and different statuses. So, if the order went from ORDERED to DELIVERED, it will appear 3 different times. I'm wondering if there is a better metric to use for this case. Probably a metric that only preserves the last state? Maybe a metric that goes to zero? Is there a metric that can be reset when it is no longer needed? Is it possible to delete or decrease a metric based on one of the labelNames?
I think it would be better to remove the label "status" (ORDERED, LOADED, DELIVERED) and use the value of the metric to indicate the status:
0 = ORDERED
1 = LOADED
2 = DELIVERED
BTW: do you really need a label called "time"? You could use the sample's built-in timestamp for this, couldn't you?
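A minimal sketch of that idea with the Node prom-client library (the metric name and the id label are illustrative, not from the question):

const client = require("prom-client");

const STATUS = { ORDERED: 0, LOADED: 1, DELIVERED: 2 };

// One gauge series per order; the value encodes the current status,
// so a status change overwrites the series instead of adding a new one.
const orderStatus = new client.Gauge({
  name: "order_status",
  help: "Current order status (0=ORDERED, 1=LOADED, 2=DELIVERED)",
  labelNames: ["id"],
});

function trackOrder(order) {
  orderStatus.set({ id: order.id }, STATUS[order.status]);
}

// Once an order is finished and no longer needed, drop its series so
// the metrics endpoint stops reporting stale orders.
function forgetOrder(order) {
  orderStatus.remove(order.id);
}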
I have the following data structure which describes an object and the time period which it's valid. Pretend the numbers below are unix timestamps.
{
  "id": 1234,
  "valid_from": 2000,
  "valid_to": 4000
},
{
  "id": 1235,
  "valid_from": 1000,
  "valid_to": 2200
}
...
I want to quickly be able to store these items in JavaScript and then query for items which are valid at a certain time.
For example if I were to query for objects valid at 2100 I would get [1234, 1235]. If I were to query for objects valid at 3999 I would get [1234], and at 4999 nothing.
I will have about 50-100k items in the structure, and I'd like fast lookup speeds, but inserts and deletes could be slower.
Items will have duplicate valid_from and valid_to values, so it needs to support duplicates. Items will overlap.
I will need to be continually inserting data into the structure (probably in bulk for the initial load, and then one-off updates as data changes). I will also be periodically modifying records, so likely a remove and insert.
I am not sure what the most efficient approach to this is.
Algorithms are not my strong suit but if I just know the correct approach I can research the algorithms themselves.
My Idea:
I was originally thinking of a modified binary search tree to support duplicate keys and closest lookup, but this would only let me query objects on one bound, > valid_from or < valid_to.
This would involve bisecting the array or tree to find all items matching the valid_from bound and then manually checking each one's valid_to.
I suppose I could have two search trees, one for valid_from and one for valid_to; then I could check which ids from the results overlap and return those ids?
This still seems kind of hacky to me. Is there a better approach someone can recommend, or is this how it's done?
Imagine you have two lists: initiation/begin and expiration/end. Both are sorted by TIME.
Given a particular time, you can find where in each list the first item is that meets the criteria by binary search. You can also do inserts by binary search into each list.
For example, if there are 1000 items and the begin location is 342, then items 1-342 in the begin list are possible; and if the end location is 901, then items 901-1000 in the end list are possible. You now need to intersect both sublists.
Take the item IDs from positions 1-342 in the begin list and 901-1000 in the end list, and put them in a separate array (allocated ahead of time). Sort the array. Traverse the array. Whenever the same ID appears twice in a row, it is a hit, a valid match.
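Here is a sketch of that approach in JavaScript (names are illustrative; items is the array of { id, valid_from, valid_to } objects):

// Build once: items sorted by valid_from and by valid_to.
const byFrom = [...items].sort((a, b) => a.valid_from - b.valid_from);
const byTo = [...items].sort((a, b) => a.valid_to - b.valid_to);

// Standard lower-bound binary search: first index where pred is false.
function lowerBound(arr, key, pred) {
  let lo = 0, hi = arr.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (pred(arr[mid], key)) lo = mid + 1; else hi = mid;
  }
  return lo;
}

function validAt(t) {
  // Items with valid_from <= t form a prefix of byFrom.
  const fromEnd = lowerBound(byFrom, t, (item, k) => item.valid_from <= k);
  // Items with valid_to >= t form a suffix of byTo.
  const toStart = lowerBound(byTo, t, (item, k) => item.valid_to < k);

  // Intersect the two candidate sets: sort the combined ids and
  // report any id that appears twice in a row.
  const ids = [];
  for (let i = 0; i < fromEnd; i++) ids.push(byFrom[i].id);
  for (let i = toStart; i < byTo.length; i++) ids.push(byTo[i].id);
  ids.sort((a, b) => a - b);

  const result = [];
  for (let i = 1; i < ids.length; i++) {
    if (ids[i] === ids[i - 1]) result.push(ids[i]);
  }
  return result;
}

With the two sample records above, validAt(2100) returns [1234, 1235], validAt(3999) returns [1234], and validAt(4999) returns []. The final sort makes each query roughly O(m log m) in the number of candidates, which should be acceptable at 50-100k items, and the two sorted arrays keep bulk loads and one-off inserts straightforward.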
I've recently started using Interactive Reports in my Oracle APEX application. Previously, all pages in the application used Classic Reports. The Interactive Report in my new page works great, but, now, I'd like to add a summary box/table above the Interactive Report on the same page that displays the summed values of some of the columns in the Interactive Report. In other words, if my Interactive Report displays 3 distinct manager names, 2 distinct office locations, and 5 different employees, my summary box would contain one row and three columns with the numbers, 3, 2, and 5, respectively.
So far, I have made this work by creating the summary box as a Classic Report that counts distinct values for each column in the same table that my Interactive Report pulls from. The problem arises when I try to filter my interactive report. Obviously, the classic report doesn't refresh based on the interactive report filters, but I don't know how I could link the two so that the classic report responds to the filters from the interactive report. Based on my research, there are ways to reference the value in the Interactive Report's search box using javascript/jquery. If possible, I'd like to reference the value from the interactive table's filter with javascript or jquery in order to refresh the summary box each time a new filter is applied. Does anyone know how to do this?
Don't do javascript parsing on the filters. It's a bad idea - just think about how you would implement it: there's a massive amount of coding to be done and plenty of ajax. And with apex 5 literally around the corner, where does it leave you when the APIs and markup are about to change drastically?
Don't just give in to a requirement either. First make sure how technically feasible it is. And if it's not, make it abundantly clear what the implications are in regard to time consumption. What is the real value to be had from these distinct value counts? Maybe there is another way to achieve what they want? Maybe this is nothing more than an attempted solution, and not the core of the real problem. Stuff to think about...
Having said that, here are 2 options:
First method: Count Distinct Aggregates on Interactive reports
You can add these to the IR through the Actions button.
Note though, that this aggregate will be THE LAST ROW! In the example I've posted here, reducing the rows per page to 5 would push the aggregate row to pagination set 3!
Second Method: APEX_IR and DBMS_SQL
You could use the apex_ir API to retrieve the IR's query and then use that to do a count.
(Apex 4.2) APEX_IR.GET_REPORT
(Apex 5.0) APEX_IR.GET_REPORT
Some pointers:
Retrieve the region ID by querying apex_application_page_regions
Make sure your source query DOES NOT contain #...# substitution strings (such as #OWNER#).
Then get the report SQL, rewrite it, and execute it. Eg:
DECLARE
  l_report    apex_ir.t_report;
  l_query     varchar2(32767);
  l_statement varchar2(32000);
  l_cursor    integer;
  l_rows      number;
  l_deptno    number;
  l_mgr       number;
BEGIN
  -- Fetch the IR definition (query + bind variables) for the region
  l_report := APEX_IR.GET_REPORT (
                p_page_id   => 30,
                p_region_id => 63612660707108658284,
                p_report_id => null);
  l_query := l_report.sql_query;
  sys.htp.prn('Statement = '||l_report.sql_query);
  for i in 1..l_report.binds.count
  loop
    sys.htp.prn(i||'. '||l_report.binds(i).name||' = '||l_report.binds(i).value);
  end loop;

  -- Wrap the IR query so the distinct counts respect the active filters
  l_statement := 'select count(distinct deptno), count(distinct mgr) from ('||l_report.sql_query||')';
  sys.htp.prn('statement rewrite: '||l_statement);

  -- Parse, re-bind the IR's bind variables, and fetch the two counts
  l_cursor := dbms_sql.open_cursor;
  dbms_sql.parse(l_cursor, l_statement, dbms_sql.native);
  for i in 1..l_report.binds.count
  loop
    dbms_sql.bind_variable(l_cursor, l_report.binds(i).name, l_report.binds(i).value);
  end loop;
  dbms_sql.define_column(l_cursor, 1, l_deptno);
  dbms_sql.define_column(l_cursor, 2, l_mgr);
  l_rows := dbms_sql.execute_and_fetch(l_cursor);
  dbms_sql.column_value(l_cursor, 1, l_deptno);
  dbms_sql.column_value(l_cursor, 2, l_mgr);
  dbms_sql.close_cursor(l_cursor);

  sys.htp.prn('Distinct deptno: '||l_deptno);
  sys.htp.prn('Distinct mgr: '||l_mgr);
EXCEPTION WHEN OTHERS THEN
  IF DBMS_SQL.IS_OPEN(l_cursor) THEN
    DBMS_SQL.CLOSE_CURSOR(l_cursor);
  END IF;
  RAISE;
END;
I threw together the sample code from the apex_ir.get_report and dbms_sql documentation.
Oracle 11gR2 DBMS_SQL reference
Some serious caveats though: the column list is tricky. If a user has control of all columns and can remove some, those columns will disappear from the select list. E.g. in my sample, letting the user hide the DEPTNO column would break the entire block, because I'd still be doing a count of that column even though it is gone from the inner query. You could block this by not letting the user control those columns, or by first parsing the statement, etc.
Good luck.
I have a CouchDB database with the following type of documents, representing events that happen within a building:
{
  "person": 1,
  "timestamp": 1,
  "event": {
    "type": "enter",
    "room": "b"
  }
}
and
{
  "person": 2,
  "timestamp": 5,
  "event": {
    "type": "leave",
    "room": "b"
  }
}
The problem that I want to solve is the following: I want to know the total amount of time that every other person spent in the same room as person 1. Note that any person can enter and leave many rooms at many different times. I honestly don't know whether MapReduce is the best paradigm for this, or if I should just export my data and write a separate script to figure this stuff out (although this is probably not a feasible solution for our production environment).
As a starting solution, let's assume that all the data is sane, and thus someone entering a room will also leave that room at a later time. However, in a final solution this requirement will probably have to be relaxed, because some events may be missing.
I have thought of a potential solution, but I have no idea whether this is at all possible or how to do this in couchdb. Here is an outline.
Create a view that emits the following format, for every person entering a room event:
emit([room, person, timestamp], null)
Create a view that emits ([room, timestamp], null) every time person 1 exits a room (this could be done for all people, but that is unnecessary).
Create a view that for each exiting a room event for any person except person 1, does the following. In the mapping step:
Queries the first view to find the last timestamp when that person entered the room.
Queries the first view to find all times before the exiting the room event that person 1 entered that room
For each of those times, queries the second view to find all exit times for that room, and for each interval checks what the overlap is.
Sum these overlaps together and emit as { person, time }
Reduce:
for every person, sum all the times together.
However, this relies on me being able to figure out how to query a different view from within a view. Does anybody know if that is possible, and if so, how?
The only way I have found of doing this within the CouchDB structure is by using a list function. I create a view that simply emits all documents with [building, timestamp] as the key. This lets me query the view with startkey and endkey to restrict it to a single day and a single building.
I then create a list function that takes all the documents returned by the view and performs the processing in a JavaScript function. This bypasses the map-reduce framework for the most part, but it was the only way I could think of doing this within the CouchDB framework. Obviously the same could be done with any other script instead of the list function, using CouchDB's RESTful API.
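A rough sketch of what that design document could look like (the building field and the computeOverlaps helper are assumptions; the documents in the question only show person, timestamp, and event):

// Map function: key every event by [building, timestamp] so that
// startkey/endkey can restrict the rows to one building and one day.
function (doc) {
  if (doc.event) {
    emit([doc.building, doc.timestamp], null);
  }
}

// List function: replay the day's events in timestamp order with
// plain JavaScript. computeOverlaps is a hypothetical helper that
// tracks who is in which room and sums each person's overlap with
// person 1. row.doc is available when queried with include_docs=true.
function (head, req) {
  var row, events = [];
  while ((row = getRow())) {
    events.push(row.doc);
  }
  send(JSON.stringify(computeOverlaps(events)));
}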
On our web application, the search results are displayed in sortable tables. The user can click on any column and sort the results. The problem is that sometimes the user does a broad search and gets a lot of data returned. To make the sortable part work, you probably need all the results, which takes a long time. Or I can retrieve a few results at a time, but then sorting won't really work well. What's the best practice for displaying sortable tables that might contain lots of data?
Thanks for all the advice. I will certainly be going over these.
We are using an existing Javascript framework that has the sortable table; "lots" of results means hundreds. The problem is that our users are at a remote site, and a lot of the delay is the network time to send/receive data from the data center. Sorting the data on the database side and only sending one page worth of results at a time is nice; but when the user clicks some column header, another round trip is done, which always adds 3-4 seconds.
Well, I guess that might be the network team's problem :)
Using sorting and paging at the database level is the correct answer. If your query returns 1000 rows, but you're only going to show the user 10 of them, there is no need for the other 990 to be sent across the network.
Here is a MySQL example. Say you need 10 rows, 21-30, from the 'people' table (the first argument to LIMIT is the zero-based offset):
SELECT * FROM people LIMIT 20, 10
You should be doing paging back on the database server. E.g. on SQL 2005 and SQL 2008 there are paging techniques. I'd suggest looking at the paging options for whatever system you're using.
What database are you using? There are some good paging options in SQL 2005 and upwards using ROW_NUMBER that allow you to do the paging on the server. I found this good one on Christian Darie's blog.
E.g. this procedure, which is used to page products in a category: you just pass in the page number you want and the number of products per page, etc.
CREATE PROCEDURE GetProductsInCategory
  (@CategoryID INT,
   @DescriptionLength INT,
   @PageNumber INT,
   @ProductsPerPage INT,
   @HowManyProducts INT OUTPUT)
AS
-- declare a new TABLE variable
DECLARE @Products TABLE
  (RowNumber INT,
   ProductID INT,
   Name VARCHAR(50),
   Description VARCHAR(5000),
   Price MONEY,
   Image1FileName VARCHAR(50),
   Image2FileName VARCHAR(50),
   OnDepartmentPromotion BIT,
   OnCatalogPromotion BIT)

-- populate the table variable with the complete list of products
INSERT INTO @Products
SELECT ROW_NUMBER() OVER (ORDER BY Product.ProductID),
       Product.ProductID, Name,
       SUBSTRING(Description, 1, @DescriptionLength) + '...' AS Description,
       Price, Image1FileName, Image2FileName, OnDepartmentPromotion, OnCatalogPromotion
FROM Product INNER JOIN ProductCategory
     ON Product.ProductID = ProductCategory.ProductID
WHERE ProductCategory.CategoryID = @CategoryID

-- return the total number of products using an OUTPUT variable
SELECT @HowManyProducts = COUNT(ProductID) FROM @Products

-- extract the requested page of products
SELECT ProductID, Name, Description, Price, Image1FileName,
       Image2FileName, OnDepartmentPromotion, OnCatalogPromotion
FROM @Products
WHERE RowNumber > (@PageNumber - 1) * @ProductsPerPage
  AND RowNumber <= @PageNumber * @ProductsPerPage
You could do the sorting on the server. AJAX would eliminate the necessity of a full refresh, but there'd still be a delay. Besides, databases are generally very fast at sorting.
For these situations I employ techniques on the SQL Server side that not only leverage the database for the sorting, but also use custom paging to ONLY return the specific records needed.
It is a bit of a pain to implement at first, but the performance is amazing afterwards!
How large is "a lot" of data? Hundreds of rows? Thousands?
Sorting can be done via JavaScript painlessly with MochiKit sortable tables. However, if the data takes a long time to sort (most likely a second or two [or three!]) then you may want to give the user some visual cue that something is happening and the page didn't just freeze. For example, tint the screen (a la Lightbox) and display a "sorting" animation or text.
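A minimal sketch of that visual cue (the overlay element id, rows, and renderRows are illustrative placeholders, not part of any library):

// Show an overlay, yield to the browser so it actually paints, then
// run the (potentially slow) synchronous sort and hide the overlay.
function sortTableWithFeedback(rows, compare, renderRows) {
  const overlay = document.getElementById("sorting-overlay"); // assumed element
  overlay.style.display = "block"; // tinted screen + "sorting..." text
  setTimeout(() => {
    rows.sort(compare);
    renderRows(rows);
    overlay.style.display = "none";
  }, 0);
}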