The mongo shell defaults to safe writes, which from my understanding happen at the end of every carriage return. What if you have code in a loop like this:
db.coll1.find().forEach(function(doc){
    db.coll2.update({"blah": doc._id}, {$set: {"blahblah": doc.value}});
});
Does db.getLastError() run for every single update, only at the very end of the loop on the last update, or once at the end of the loop for every updated document, all at one time?
The shell actually uses w:1 (safe writes) in interactive mode; when running in a loop it will not call getLastError until the end.
For reference, you can see this comment by @Asya, who works for MongoDB Inc.:
the shell uses safe in that it calls getLastError after every "command" (i.e. carriage return). If you are writing data, say, in a loop then GLE will only be called once at the end. Provide more details about how you plan to populate the collection from the shell - maybe the right thing will already happen
Setting MongoDB's write concern in shell / shell script
Adding to that answer from Sammaye, if you wanted to call GLE after each update call you would do something like this:
db.coll1.find().forEach(function(doc){
    // the third argument `true` makes this an upsert
    db.coll2.update({"blah": doc._id}, {$set: {"blahblah": doc.value}}, true);
    var err = db.getLastError(1);   // w:1 - check the outcome of the last write
    if (err) {
        printjson(err);
        print("failed to update ID: " + doc._id);
    }
});
You'll get a duplicate error at the end of the loop (from the shell's own getLastError on the carriage return) if the last op fails, but otherwise it's pretty much what you would expect. If you want an easy way to reproduce a failure, set the value field to a string and then use $inc instead of $set - $inc will fail on non-numeric values.
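Note that in newer shells (MongoDB 2.6+) you can skip getLastError entirely by passing a write concern with the update itself and checking the returned WriteResult - a rough sketch of that approach, outside the 2.x-era shell discussed above:
db.coll1.find().forEach(function (doc) {
    var res = db.coll2.update(
        {"blah": doc._id},
        {$set: {"blahblah": doc.value}},
        {upsert: true, writeConcern: {w: 1}}   // per-operation write concern
    );
    if (res.hasWriteError()) {
        print("failed to update ID: " + doc._id + " - " + res.getWriteError().errmsg);
    }
});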
I am trying to adapt a script I already have to run using .csv data input. When the script is run without the .csv, it runs perfectly for any configuration I choose to use. When it runs using the .csv, whatever scenario is in the first row will run perfectly, but everything from there on will fail. The reason for the failure is that some of my variables are being reused from the first thread and I don't know how to stop this from happening.
This is what my script looks like:
HTTP Request - GET ${url} (url is declared in the CSV data input, and changes each run)
-> postprocessor that extracts Variable_1, Variable_2 and Variable_3
Sampler1
-> JSR223 preprocessor: creates payloadSampler1 using javascript, example:
var payloadSampler1 = { };
payloadSampler1.age = vars.get("Variable_2");
payloadSampler1.birthDate = "1980-01-01";
payloadSampler1.phone = {};
payloadSampler1.phone.number = "555-555-5555";
vars.put("payloadSampler1", JSON.stringify(payloadSampler1));
Sampler2
-> JSR223 preprocessor: creates payloadSampler2 using javascript (same as above but for different values)
Sampler3
-> JSR223 preprocessor: creates payloadSampler3 using javascript (same as above but for different values)
Sampler4
-> JSR223 preprocessor: creates payloadSampler4 using javascript (same as above but for different values)
HTTP Request - POST ${url}/${Variable_1}/submit
-> JSR223 preprocessor: creates payloadSubmit using javascript, mixing and matching the results from the above samplers - like so:
var payloadSubmit = { };
if (vars.get("someVar") != "value" && vars.get("someVar") != "value2" && vars.get("differentVar") != "true") {
payloadSubmit.ageInfo = [${payloadSampler1}];
}
if (vars.get("someVar2") != "true") {
payloadSubmit.paymentInfo = [${payloadSampler2}];
}
payloadSubmit.emailInfo = [${payloadSampler3}];
payloadSubmit.country = vars.get("Variable_3");
vars.put("payloadSubmit", JSON.stringify(payloadSubmit));
-> Body Data as shown in the screenshot (not reproduced here)
I have a Debug PostProcessor to see the values of all these variables I am creating. For the first iteration of my script, everything is perfect. For the second one, however, the Debug PostProcessor shows the values for all the payloadSampler variables and all the Variables correctly changed to match the new row data (from the csv), but the final variable, payloadSubmit, just reuses whatever the values were from the first thread iteration.
Example:
Debug PostProcessor at the end of first iteration shows:
Variable_1=ABC
Variable_2=DEF
Variable_3=GHI
payloadSampler1={"age":"18","email":null,"name":{"firstName":"Charles"}},{"age":"38","email":null}}
payloadSampler2={"paymentChoice":{"cardType":"CreditCard","cardSubType":"VI"}},"amount":"9.99","currency":"USD"}
payloadSampler3={"email":"tes#email.com"}
payloadSubmit={"ageInfo":[{"age":"18","email":null,"name":{"firstName":"Charles"}},{"age":"38","email":null}],"paymentInfo":[{"paymentChoice":{"cardType":"CreditCard","cardSubType":"VI"}},"amount":"9.99","currency":"USD"],"emailInfo":[{"email":"tes#email.com"}],"country":"GHI"}
But at the end of the 2nd iteration it shows:
Variable_1=123
Variable_2=456
Variable_3=789
payloadSampler1={"age":"95","email":null,"name":{"firstName":"Sam"}},{"age":"12","email":null}}
payloadSampler2={"paymentChoice":{"cardType":"CreditCard","cardSubType":"DC"}},"amount":"19.99","currency":"USD"}
payloadSampler3={"email":"tes2#email.com"}
payloadSubmit={"ageInfo":[{"age":"18","email":null,"name":{"firstName":"Charles"}},{"age":"38","email":null}],"paymentInfo":[{"paymentChoice":{"cardType":"CreditCard","cardSubType":"VI"}},"amount":"9.99","currency":"USD"],"emailInfo":[{"email":"tes#email.com"}],"country":"USA"}
I can also see that the final HTTP Request is indeed sending the old values.
My very limited understanding is that because I am invoking the variables like so, "${payloadSampler1}", it will use the value that was set the first time the sampler was run (back in the 1st thread iteration). These are the things I have tried:
If I use vars.get("payloadSubmit") in the body of an HTTP Sampler, I get an error, so that is not an option. If I use vars.get("payloadSampler1") in the samplers that create the variables, extra escape characters are added, which breaks my JSON. I have tried adding a counter to the end of the variable name and having that counter increase on each thread iteration, but the result is the same. All the variables and samplers other than the last one have updated values, but the last one will always reuse the variables from the first thread iteration.
I also tried to use ${__javaScript(vars.get("payloadSubmit_"+vars.get("ThreadIteration")))}, but the results are always the same.
And I have also tried using the ${__counter(,)} element, but if I set it to TRUE, it will always be 1 for each thread iteration, and if I set it to FALSE, it starts at 2 (I am assuming it is because I use a counter in another sampler within this thread - but even after removing that counter this still happens).
I am obviously doing something (or many things) wrong.
If anyone can spot what my mistakes are, I would really appreciate hearing your thoughts. Or even being pointed to some resource I can read for an approach I can use for this. My knowledge of both javascript and jmeter is not great, so I am always open to learn more and correct my mistakes.
Finally, thanks a lot for reading through this wall of text and trying to make sense of it.
It's hard to tell where exactly your problem is without seeing the values of someVar and the payloads. Most probably something cannot be parsed as valid JSON, so on the 2nd iteration your last JSR223 PreProcessor fails to run to the end and, as a result, your payloadSubmit variable value doesn't get updated. Take a closer look at the JMeter GUI: there is a yellow triangle with an exclamation sign which indicates the number of errors in your scripts, and clicking it opens the JMeter Log Viewer.
If there is a red number next to the triangle, you obviously have a problem and will need to look at the jmeter.log file for the details.
Since JMeter 3.1 it is recommended to use the Groovy language for any form of scripting, mainly because Groovy has much better performance compared to the other scripting options. Check out the Parsing and producing JSON guide to learn more about how to work with JSON data in Groovy.
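One common cause of exactly this symptom (an assumption on my part, not confirmed from the information above) is referencing variables as ${payloadSampler1} inside a JSR223 script: with script caching enabled, the first iteration's values get baked into the compiled script. Reading the variables with vars.get() instead avoids that. A rough javascript sketch of the submit preprocessor, reusing the names from the question:
// JSR223 PreProcessor (language: javascript) - hypothetical rework of the payloadSubmit script
var payloadSubmit = {};
if (vars.get("someVar") != "value" && vars.get("someVar") != "value2" && vars.get("differentVar") != "true") {
    // payloadSampler1 was stored as a JSON string, so parse it back into an object
    payloadSubmit.ageInfo = [JSON.parse(vars.get("payloadSampler1"))];
}
if (vars.get("someVar2") != "true") {
    payloadSubmit.paymentInfo = [JSON.parse(vars.get("payloadSampler2"))];
}
payloadSubmit.emailInfo = [JSON.parse(vars.get("payloadSampler3"))];
payloadSubmit.country = vars.get("Variable_3");
vars.put("payloadSubmit", JSON.stringify(payloadSubmit));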
I am developing a sales report system which relies heavily on JSON communication. I have a script that records client visits into a JavaScript object, which works fine, apparently.
salesReport = [];
...
salesReport.push({
    "nr": visitCounter,
    "kto": ActiveAccount,
    "dok": dokName
});
Each time a visit is logged the push function is activated.
On the first run I get the expected result:
[{"nr":1,"kto":"52803","dok":""}]
But when I push again, I get this result:
[[[[[{"nr":1,"kto":"52803","dok":""}],{"nr":2,"kto":"52350","dok":""}], {"nr":3,"kto":"52539","dok":""}],{"nr":4,"kto":"50869","dok":""}],{"nr":5,"kto":"52135","dok":""}]
The '[' brackets are added at the beginning of the output and at the end of each entry. Why is that?
Shouldn't '[' and ']' only be added at the beginning and at the end, and then only once?
So, it seems there was an idiotic error in another script.
At the end of each session the visitor log is stored in local storage, and it is read back into the JavaScript object if another session is started the same day.
The problem was that I had used the .push function to read the "old" data back into the object, creating a double push of sorts, which led the system to treat it all as one entry instead of several entries.
So in the end it was "my bad".
Logging this in case someone else experiences the same thing in the future.
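For anyone hitting the same thing, the fix is to restore the stored array as a whole rather than pushing the whole array back in as a single element. A rough sketch, assuming the log is kept under a 'salesReport' key in localStorage (the original storage code isn't shown in the question):
// Restore the previous session's log without nesting it inside a new array.
var stored = localStorage.getItem("salesReport");   // key name is an assumption
salesReport = stored ? JSON.parse(stored) : [];

// New visits are then appended one entry at a time, as before:
salesReport.push({
    "nr": visitCounter,
    "kto": ActiveAccount,
    "dok": dokName
});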
I know that node.js is asynchronous, but what I don't really understand is to what extent.
Example:
If I need to do 3 things in sequence, and each one needs to be done before the next begins, do I need to use callbacks?
var myContainer = [];
for (var i = 0; i < 10000; i++) {
    myContainer.push(i.toString());
}
for (var j = 0; j < myContainer.length; j++) {
    console.log(myContainer[j]);
}
for (var x = 0; x < myContainer.length; x++) {
    myModuleForEmails.sendEmailsTo(myContainer[x]);
}
Ok, suppose for a moment that I have a module like Imap ready and that calling myModuleForEmails.sendEmailsTo(myContainer[x]) really sends the email.
I also know that this program is absolutely useless, but it's just to help me understand.
Suppose I must push all 10000 strings into myContainer, only THEN log all the strings that are in myContainer to the console, and only AFTER BOTH do I need to send the emails.
Is this version reliable or do I need 2 callbacks? And does the number of iterations matter? That is, if I had used 10 instead of 10000, could I have used this syntax because it takes so little time to do 10 operations that the first for loop finishes before the second one starts?
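For what it's worth: the loops that fill and log myContainer are plain synchronous JavaScript, so they always run to completion in order, whether there are 10 or 10000 iterations. Only the email step would need a callback if sendEmailsTo does real I/O. A minimal sketch, assuming sendEmailsTo accepts a completion callback (its real signature isn't shown in the question):
var pending = myContainer.length;
for (var x = 0; x < myContainer.length; x++) {
    // assumed signature: sendEmailsTo(address, callback(err))
    myModuleForEmails.sendEmailsTo(myContainer[x], function (err) {
        if (err) {
            console.log("sending failed: " + err);
        }
        pending--;
        if (pending === 0) {
            console.log("all emails have been handed off");
        }
    });
}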
I have a database with roughly 1.2M names. I'm using Twitter's typeahead.js to remotely fetch the autocomplete suggestions when you type someone's name. In my local environment this takes roughly 1-2 seconds for the results to appear after you stop typing (the autocomplete doesn't appear while you are typing), and 2-5+ seconds on the deployed app on Heroku (using only 1 dyno).
I'm wondering if the reason it only shows the suggestions after you stop typing (and after a delay of a few seconds) is that my code isn't optimized?
The script on the page:
<script type="text/javascript">
    $(document).ready(function() {
        $("#navPersonSearch").typeahead({
            name: 'people',
            remote: 'name_autocomplete/?q=%QUERY'
        })
        .keydown(function(e) {
            if (e.keyCode === 13) {
                $("form").trigger('submit');
            }
        });
    });
</script>
The keydown snippet is there because without it my form doesn't submit for some reason when pressing Enter.
My Django view:
def name_autocomplete(request):
    query = request.GET.get('q', '')
    if len(query) > 0:
        results = Person.objects.filter(short__istartswith=query)
        result_list = []
        for item in results:
            result_list.append(item.short)
    else:
        result_list = []
    response_text = json.dumps(result_list, separators=(',', ':'))
    return HttpResponse(response_text, content_type="application/json")
The short field in my Person model is also indexed. Is there a way to improve the performance of my typeahead?
I don't think this is directly related to Django, but I may be wrong. I can offer some generic advice for this kind of situation:
(My money is on #4 or #5 below.)
1) What is the average "ping" from your machine to Heroku? If it's far, that's a little bit of extra overhead. Not much, though - certainly not much compared to the 8-9 seconds you are referring to. The penalty will be larger with https, mind you.
2) Check the values of rateLimitFn and rateLimitWait in your remote dataset. Are they the defaults?
3) In all likelihood, the problem is database/dataset related. The first thing to check is how long it takes you to establish a connection to the database (do you use a connection pool?).
4) Second thing: how long does it take to run the query? My bet is on this point or the next. Add debug prints, or use New Relic (even the free plan is OK). Have a look at the generated query and make sure it is indexed. Have your DB "explain" the execution plan for the query and make sure it uses the index.
5) Third thing: are the results large? If, for example, you specify "J" as the query, I imagine there will be lots of answers. Just fetching them and streaming them to the client will take time. In such cases:
5.1) Specify a minLength for your dataset. Make it at least 3, if not 4.
5.2) Limit the result set that your DB query returns. Make it return no more than 10, say. (See the sketch just after this list.)
6) I am no Django expert, but make sure the way you use your model in Django doesn't make it load the entire table into memory first. Just sayin'.
HTH.
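To illustrate 5.1 and 5.2 on the client side, a rough sketch of the dataset options (option names assume the typeahead.js 0.9.x single-dataset API used in the question; the server-side cap from 5.2 still has to happen in the Django view):
$("#navPersonSearch").typeahead({
    name: 'people',
    minLength: 3,                      // don't hit the server until 3 characters are typed
    limit: 10,                         // show at most 10 suggestions
    remote: {
        url: 'name_autocomplete/?q=%QUERY',
        rateLimitWait: 300             // throttle remote requests to one every 300 ms
    }
});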
results = Person.objects.filter(short__istartswith=query)
result_list = []
for item in results:
    result_list.append(item.short)
Probably not the only cause of your slowness, but this is horrible from a performance point of view: never loop over a Django queryset like that. To assemble a list from a queryset you should always use values_list. In this specific case:
results = Person.objects.filter(short__istartswith=query)
result_list = results.values_list('short', flat=True)
This way you are getting the single field you need straight from the db, instead of fetching the whole table row, creating a Person instance from it, and finally reading the single attribute from it.
Nitzan covered a lot of the main points that would improve performance, but unlike him I think this might be directly related to Django (or at least, the server side).
A quick way to test this would be to update your name_autocomplete method to simply return 10 randomly generated strings in the format that typeahead expects. (The reason we want them random is so that typeahead's caching doesn't skew the results.)
What I suspect you will see is that typeahead now runs pretty quickly, and you should start seeing results appear as soon as your minLength of characters has been typed.
If that is the case then we will need to look into what could be slowing the query down; my Python skills are non-existent, so I can't help you there, sorry!
If that isn't the case then I would consider logging when $('#navPersonSearch') fires typeahead:initialized and typeahead:opened to see if they bring up anything odd.
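If it helps, wiring up that logging might look like this (event names as used in typeahead.js 0.9.x):
// Log the typeahead lifecycle events mentioned above to the browser console.
$('#navPersonSearch')
    .on('typeahead:initialized', function () { console.log('typeahead:initialized fired'); })
    .on('typeahead:opened', function () { console.log('typeahead:opened fired'); });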
You can use django-haystack; your server-side code would be roughly like:
def autocomplete(request):
    sqs = SearchQuerySet().filter(content_auto=request.GET.get('q', ''))[:5]  # or however many names you need
    suggestions = [result.first_name for result in sqs]
    # you have to configure typeahead to process the returned data; this is a simple example
    data = json.dumps({'q': suggestions})
    return HttpResponse(data, content_type='application/json')
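On the client side, a sketch of how that {"q": [...]} response could be unpacked, assuming the typeahead.js 0.9.x remote API used earlier in the thread (the URL is a placeholder):
$("#navPersonSearch").typeahead({
    name: 'people',
    remote: {
        url: 'autocomplete/?q=%QUERY',
        filter: function (response) {
            return response.q;   // the view above returns {"q": ["name1", "name2", ...]}
        }
    }
});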
Is there a way to write to STDOUT without a trailing newline from the Mongo shell? I can't seem to find anything other than print() available.
This is related to my SO question on reading a line from the console. Per @Stennie's comment, it is not possible in the current (2.0.6) version of the Mongo shell.
There might be ways to work around it. You can accumulate the results in an intermediate variable (an array, a string, or any other data structure), then print the entire thing on a single line. The example below uses an array to capture values from the query results; the array is then converted to a string with a comma as the separator. In my case I'm interested in just the _id field:
var cursor = db.getCollection('<collection name>').find(<your query goes here>);
var values = [];
cursor.forEach(function (doc) { values.push(doc._id); });
print(values.join(','));   // everything comes out on a single line
Depending on how many results you're expecting, the space consumed by the intermediate data structure might overwhelm memory. If that's the case, you can craft the query to return smaller subsets of data that, added together, comprise the full result set you're going for.
This is quite an old question, however it is still relevant, so answering.
One can use printjsononeline().
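For example, a quick sketch (legacy mongo shell; the collection name is a placeholder):
// Prints each matching document as compact, single-line JSON.
db.getCollection('<collection name>').find().forEach(function (doc) {
    printjsononeline(doc);
});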