Strip paragraph text before cue word - javascript

I have a text like this:
Last login: today
cat file
testMachine:root:/root# cat file
File contents
testMachine:root:/root#
And I need to retrieve the information like this:
testMachine:root:/root# cat file
File contents
Stripping away the last line is easy, but the number of lines I need to remove at the start is arbitrary, and I need to remove everything up to the first occurrence of the cue word, which is the machine name; that name is known and stored.
I have tried substring(), but it strips line by line instead of treating the whole text as one, and it removes the host name too, which should remain. I tried replace() too, but I am not familiar with regex, and the result is a memory exception.
EDIT 1: It seems important to note that running JS on a Java engine (in this case Rhino) does not give the same result as running it in a browser. I found this out after an answer below, which works perfectly on the web, didn't even run in the desktop app.

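const text = `
Last login: today
cat file
testMachine:root:/root# cat file
File contents
testMachine:root:/root#`;
const cueWord = "testMachine:root";
// Position of the first occurrence of the cue word in the whole text
const idx = text.indexOf(cueWord);
// Keep everything from the cue word onward, one array entry per line
let restOfTheString = text.substring(idx).split("\n");
// Drop the trailing prompt line
restOfTheString.pop();
console.log(restOfTheString.join("\n"));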
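Since the edit in the question mentions Rhino, here is a minimal ES5-only sketch of the same idea, on the assumption that the embedded engine may not accept template literals. The function name stripBeforeCue is hypothetical. The String(...) coercion is also an assumption: text handed over from Java may arrive as a java.lang.String, whose methods are Java's rather than JavaScript's.

```javascript
// Hypothetical helper: keeps the text from the first cue word onward
// and drops the final line (the trailing prompt).
function stripBeforeCue(text, cueWord) {
  // Assumption: coerce in case the value is a Java string under Rhino
  var s = String(text);
  var idx = s.indexOf(String(cueWord));
  if (idx === -1) {
    return s; // assumption: leave the text untouched when the cue is missing
  }
  var lines = s.substring(idx).split("\n");
  lines.pop(); // remove the trailing prompt line
  return lines.join("\n");
}

var sample = "Last login: today\ncat file\n" +
  "testMachine:root:/root# cat file\nFile contents\n" +
  "testMachine:root:/root#";
var result = stripBeforeCue(sample, "testMachine:root");
// print(result) in the Rhino shell, console.log(result) in a browser
```

Only indexOf, substring, split, pop and join are used, so no regex is involved.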


read/write csv file using javascript in .htm

Intro:
I'm trying to make an HTML application (.htm) to do some business calculations. The issue is that I need to keep records of everything.
First I found some Visual Basic scripts to read/write .mdb files, but that was too complicated for me since I have never worked with VBS.
So I decided to use JavaScript to read/write a .csv file.
This is the function I found for reading:
function displayClassList() {
    var path = "log.csv";
    var fso = new ActiveXObject('Scripting.FileSystemObject'),
        iStream = fso.OpenTextFile(path, 1, false);
    document.getElementById("searchResults").innerHTML = "";
    while (!iStream.AtEndOfStream) {
        var line = iStream.ReadLine();
        document.getElementById("searchResults").innerHTML += line + "<br/>";
    }
    iStream.Close();
}
It works well.
The problem I have is with writing. I cannot append text on a new line in the document. This is the script I got:
var fso = new ActiveXObject("Scripting.FileSystemObject");
var s = fso.CreateTextFile("./ClassList.csv", true);
s.WriteLine("helloworld");
s.Close();
The problem with this script is that it replaces all the existing text with "helloworld". What I want is to write "helloworld" on a new line. Any solution for that?
Also, is there any way to edit a specific line, like replacing all text in line x?
Here are the scripts for download so that you can test them: http://ge.tt/7u5bDAV2/v/0
If you want to append to the file without overwriting the existing contents, you can use the OpenTextFile method - note that the CreateTextFile method you're using truncates the existing contents.
var fso = new ActiveXObject("Scripting.FileSystemObject");
// 8 = ForAppending: writes go to the end of the existing file
var s = fso.OpenTextFile("./ClassList.csv", 8);
s.WriteLine("helloworld");
s.Close();
There is no easy way to modify one line of a text file in place unless the modification leaves the line the same length: a shorter replacement leaves part of the old line behind, while a longer one overwrites the start of the next line.
More importantly, the FileSystemObject does not support seeking, which you would need in order to jump to a specific line.
If you want to modify one line of the file, your best bet is to:
1. Open the existing file for reading, and also create a new file for writing
2. Read the existing file line by line, writing the content you want to keep to the new file
3. Write your modified line(s) to the new file where needed
4. Close both files, and rename the new file to replace the old one
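The copy step in the list above can be sketched roughly as follows. This is only an illustration: copyWithEdits, makeReader, makeWriter and the edits map are hypothetical names. The function relies only on AtEndOfStream, ReadLine and WriteLine, which the TextStream objects used elsewhere in this thread provide; the in-memory stand-ins exist so the sketch can run without ActiveX. Opening, closing and renaming the real files is left out.

```javascript
// Copies every line from reader to writer, substituting the lines listed
// in edits (a hypothetical map from 0-based line index to new text).
function copyWithEdits(reader, writer, edits) {
  var index = 0;
  while (!reader.AtEndOfStream) {
    var line = reader.ReadLine();
    writer.WriteLine(edits.hasOwnProperty(index) ? edits[index] : line);
    index++;
  }
}

// In-memory stand-in for a TextStream opened for reading.
function makeReader(lines) {
  var pos = 0;
  var reader = { ReadLine: function () { return lines[pos++]; } };
  Object.defineProperty(reader, "AtEndOfStream", {
    get: function () { return pos >= lines.length; }
  });
  return reader;
}

// In-memory stand-in for a TextStream opened for writing.
function makeWriter() {
  var out = [];
  return { lines: out, WriteLine: function (s) { out.push(s); } };
}

var writer = makeWriter();
copyWithEdits(makeReader(["a", "b", "c"]), writer, { 1: "B" });
console.log(writer.lines.join("\n")); // a, B, c on separate lines
```

Because the old file is only read and the new one only written, neither stream needs to seek.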
Having said that, maybe it would be easier for you if your data file was an HTML or XML document rather than a CSV, since you could then use DOM manipulation functions to read and write it.
Generally you use "\n" in your string to create new lines. It represents the newline character in a JS string.
Here's how "lines" work in text files. It's just a long sequence of characters, and one of the possible characters is the newline character. Whatever program renders the file when you view it just knows to show text after a newline character below any text that was before it. You could split the string you read by the newline character and get an Array representing each line and work with it that way. Then to write you'd join that Array by the newline character and write the resulting string.
Note that some programs require "\r\n" to represent a proper newline and won't render new lines for just a "\n"...so try "\r\n" as the newline if you're having trouble getting newlines to work for the program you use to view the text files.
EDIT: Since you don't seem to believe me I'll just prove it to you with code. I did it with an .hta file, but the concept is the same.
I made a text file "myFile.txt" and an .hta file with the code in it. The text file held these contents:
This is line 1
Line 2
the third line
line 4 and stuff
fifth line
Then in my code I made these two functions for easily reading and writing:
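// Reads the whole file and returns its contents as one string.
function getFile(fname)
{
    var opener = new ActiveXObject("Scripting.FileSystemObject");
    var pointer = opener.OpenTextFile(fname, 1, true); // 1 = ForReading
    var cont = pointer.ReadAll();
    pointer.Close();
    return cont;
}
// Overwrites the file with the given content.
function setFile(fname, content)
{
    var opener = new ActiveXObject("Scripting.FileSystemObject");
    var pointer = opener.OpenTextFile(fname, 2, true); // 2 = ForWriting
    // Write, not WriteLine, so no extra newline is appended on every save
    pointer.Write(content);
    pointer.Close();
}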
The programs I was using expect "\r\n" as the newline, so that's what the example uses. I simply split and join the content string to edit whichever line I choose:
var content = getFile('myFile.txt'); // read it
var lineArr = content.split('\r\n'); // now we have an array of the file's lines
lineArr[2] = 'NEW LINE CONTENT!'; // editing third line (indexed from 0)
var newContent = lineArr.join('\r\n'); // make it text again with newlines
setFile("myFile.txt", newContent); // write it
Now the text file looks like this:
This is line 1
Line 2
NEW LINE CONTENT!
line 4 and stuff
fifth line
Bam. Editing individual lines in a text format file by understanding how newlines work in text.
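The split-and-join example above assumes the file uses "\r\n". A small variant that tolerates either newline style could look like the sketch below; replaceLine is a hypothetical helper name, and guessing the newline style from the first "\r\n" found is an assumption that would not suit a file with mixed line endings.

```javascript
// Hypothetical helper: replaces one line (0-based index) in a block of text.
function replaceLine(content, index, newText) {
  // Assumption: the file uses one newline style throughout
  var eol = content.indexOf("\r\n") !== -1 ? "\r\n" : "\n";
  var lines = content.split(/\r?\n/); // accepts "\n" and "\r\n"
  lines[index] = newText;
  return lines.join(eol);
}

var edited = replaceLine("This is line 1\r\nLine 2\r\nthe third line", 2, "NEW LINE CONTENT!");
console.log(JSON.stringify(edited));
```

The returned string can then be passed to setFile as in the example above.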

Can someone help me see how to deal with single quote escaping in the following scenario

We write JS programs for clients and allow them to craft the display text. Here is what we did.
We have a raw JS file in which those strings are replaced with tokens, for example
month = [_MonthToken_];
name = '_NameToken_';
and we have an XML file that allows the user to specify the text, like
<xml>
<token name="MonthToken">'Jan','Feb','March'</token>
<token name="NameToken">Alice</token>
</xml>
and we have a generator that replaces each token with its text and generates the final JS file.
month = ['Jan','Feb','March'];
name = 'Alice';
However, I found a bug in this scenario. When somebody specifies the name "D'Angelo" (for example), the JS runs into an error because the name variable becomes
name='D'Angelo'
We have thought of several ways to fix the problem, but none of them is perfect.
We could ask our clients to escape the characters, but that seems inappropriate given that they may not know JS and there are more cases to escape (", ), which could make them unhappy :|
We also thought of changing the generator to escape ', but sometimes the text replaces an array, and the single quotes there should not be escaped. (There are other cases; we could detect them case by case, but that is tedious.)
We may have done something wrong in the whole scenario/architecture, but we don't want to change it unless we have confirmed that it is definitely necessary.
So, is there any solution? I will look into every idea. Thank you in advance!
(I may also need a better title :P)
I think your XML schema is poorly designed, and this is the root cause of your problems.
Basically, you are forcing the author of the XML to put JavaScript code inside the name="MonthToken" element, while expecting her to do so without knowing JavaScript syntax. I guess that you are planning to use eval on the parsed element content to build the month and name variables.
The problem you discovered is not the only one: you are also exposed to JavaScript code injection. What if a user forges an element such as:
<token name="MonthToken">alert('put some evil instruction here')</token>
I would suggest changing the XML schema this way:
<xml>
<token name="MonthToken">Jan</token>
<token name="MonthToken">Feb</token>
<token name="MonthToken">March</token>
<token name="NameToken">Alice</token>
</xml>
Then, in your generator, parse the content of each MonthToken element and add it to the month array. Do the same for the name variable.
This way:
You don't use eval, so there is no possibility of code injection
Your users no longer have to know how to quote month names
You automatically handle quotes or apostrophes in names, because you are not using them as JS code.
If you want the month variable to become a string when the user enters just one month, simply transform the variable with something like this:
if (month.length == 1) {
month = month[0];
}
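If the generator still has to emit a JS file, one possible way to write the collected values as literals is JSON.stringify, which escapes quotes and backslashes itself. This is a swapped-in technique and not part of the answer above; emitAssignment is a hypothetical name, and values stands for the strings already read from the token elements of one name.

```javascript
// Hypothetical generator step: values holds the plain text of every
// token element with the same name, as read from the XML.
function emitAssignment(varName, values) {
  // One value becomes a string literal, several become an array literal.
  var literal = values.length === 1
    ? JSON.stringify(values[0])
    : JSON.stringify(values);
  return varName + " = " + literal + ";";
}

console.log(emitAssignment("month", ["Jan", "Feb", "March"]));
console.log(emitAssignment("name", ["D'Angelo"]));
```

The client only ever types plain text, so neither the array case nor the apostrophe case needs special handling in the XML.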

Javascript Bookmarklet Unresponsive

Javascript newb here. Creating a bookmarklet to automate a simple task at work. Mostly a learning exercise. It will scan a transcript on CNN.com, for instance: (http://transcripts.cnn.com/TRANSCRIPTS/1302/28/acd.01.html). It will grab the lead stories at the top of the page, the name and title of the guests on the show, and format them so that they can be copy pasted into another document.
I've come up with a simple version that includes some jQuery that grabs the subheading and then uses a regular expression to find the names of the guests (it will also exclude everything between (begin videoclip) and (end videoclip), but I haven't gotten that far yet). It then alerts them (it will eventually print them in a pop-up window; alert is just for troubleshooting purposes).
I'm using http://benalman.com/code/test/jquery-run-code-bookmarklet/ to create the bookmarklet. My problem is that once the bookmarklet is created it is completely unresponsive. Click on it and nothing happens. I've tried minimizing the code first with no result. My guess is that cnn.com's javascript is conflicting with mine but I'm not sure how to get around that. Or do I need to include some code to load and store the text on the current page? Here's the code (I've included comments, but I took these out when I used the bookmarklet generator.) Thanks for any help!
//Grabs the subheading
var leadStories=$(".cnnTransSubHead").text();
//Scans the webpage for guest name and title. Includes a regular expression to find any
//string that starts with a capital letter, includes a comma, and ends in a colon.
var scanForGuests=/[A-Z ].+,[A-Z0-9 ].+:/g;
//Joins the array created by scanForGuests with a semicolon instead of a comma
var guests=scanForGuests.join(‘; ‘);
//Creates an alert in the proper format including stories and guests.
alert(“Lead Stories: “ + leadStories + “. ” + guests + “. SEE TRANSCRIPT FIELD FOR FULL TRANSCRIPT.“)
Go to the page. Open up developer tools (ctrl+shift+j in chrome) and paste your code in the console to see what's wrong.
The $ in var leadStories = $(".cnnTransSubHead").text(); is from jQuery and the link provided does not have jQuery loaded into the page.
On any modern browser you should be able to achieve the same results without jQuery:
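// getElementsByClassName returns an HTMLCollection, which has no map(),
// so borrow Array's map
var leadStories = Array.prototype.map.call(
    document.getElementsByClassName('cnnTransSubHead'),
    function(el) { return el.innerText; });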
Next we have:
var scanForGuests=/[A-Z ].+,[A-Z0-9 ].+:/g;
var guests=scanForGuests.join('; ');
scanForGuests IS a regular expression; you never actually matched it against anything, so .join() is going to throw an error. I'm not exactly sure what you're trying to do. Are you trying to scan the full text of the page for that regex? In that case something like this would be your best bet:
document.body.innerText.match(scanForGuests);
Keep in mind that while innerText removes HTML markup, it's far from perfect, and what shows up in it is very much at the mercy of how the page's HTML is structured. That said, on my quick test it seems to work.
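To see what the match call returns without loading the page, the regex can be tried on a sample string. The sample text and the findGuests name below are made up for illustration. Note that match returns null when nothing is found, so the sketch falls back to an empty array before joining.

```javascript
// Same regex as in the question
var scanForGuests = /[A-Z ].+,[A-Z0-9 ].+:/g;

// Hypothetical helper: returns the matches joined with "; ",
// or an empty string when there are none.
function findGuests(text) {
  var found = text.match(scanForGuests) || []; // match() may return null
  return found.join("; ");
}

// Made-up sample standing in for document.body.innerText
var sampleText = "JOHN SMITH, ANALYST: Hello there.\nJANE DOE, REPORTER: Thanks.";
console.log(findGuests(sampleText));
// JOHN SMITH, ANALYST:; JANE DOE, REPORTER:
```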
Finally, for something like this you should use an immediately invoked function or you're sticking all your variables into the global context.
So putting it all together you get something like this:
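(function() {
    // HTMLCollection has no map(), so borrow Array's map
    var leadStories = Array.prototype.map.call(
        document.getElementsByClassName('cnnTransSubHead'),
        function(el) { return el.innerText; });
    var scanForGuests = /[A-Z ].+,[A-Z0-9 ].+:/g;
    // match() returns null when nothing is found, so fall back to []
    var guests = (document.body.innerText.match(scanForGuests) || []).join("; ");
    alert("Leads: " + leadStories + " Guests: " + guests);
})();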

textmate reformat with 2 spaces

I've set TextMate to use soft tabs with 2 spaces on my file. But when I try to reformat the entire document, it uses 2 hard tabs as the indents.
Regular indents work as I want them to; only the document reformat doesn't. Any way to get TextMate to be obedient?
Thanks.
The JavaScript bundle's "Reformat Document / Selection" command is passing the document's text to the js_beautify function in the bundle's beautify.php file (found on my system and probably by default at /Applications/TextMate.app/Contents/SharedSupport/Bundles/JavaScript.tmbundle/Support/lib/beautify.php). If you take a look at the function definition you'll see that there's a second parameter, $tab_size, with a default value of 4. There's a line in the bundle that reads print js_beautify($input);. Change this to print js_beautify($input, 2); and you should, I expect, get tab stops with two spaces.
To make it a bit more flexible, use the TextMate environment variable TM_TAB_SIZE, as in print js_beautify( $input, getenv('TM_TAB_SIZE' ) );, which should update how the command operates if you ever change your tab size.
Note, I've tested none of this. :) Just took a look at the bundle and tracked down what seems to be necessary.
So, I tried Chuck's suggestion and it gave me an error. I did this to "fix it". I'm sure it could be done more elegantly, but this worked for me.
Open up the same file Chuck says to open up, line 50 (or so) should look like this:
function js_beautify($js_source_text, $tab_size = 4)
change $tab_size to 1
function js_beautify($js_source_text, $tab_size = 1)
Now, around line 56 where it says:
$tab_string = str_repeat(' ', $tab_size);
change the space to a tab like so:
$tab_string = str_repeat("\t", $tab_size);
That worked for me.

Problem with regexp in userscript for chrome

This might be a noob question, but I have tried to find an answer here and on other sites and I still have not found one. At least not one that I understand well enough to fix the problem.
This is used in a userscript for chrome.
I'm trying to select a date from a string. The string is the innerHTML from a tag that I have managed to select. The html structure, and also the string, is something like this: (the div is the selected tag so everything within is the content of the string)
<div id="the_selected_tag">
link
" 2011-02-18 23:02"
thing
</div>
If you have a solution that helps me select the date without this fuss, that would also be great.
The javascript:
var pattern = /\"\s[\d\s:-]*\"/i;
var tag = document.querySelector('div.the_selected_tag');
var date_str = tag.innerHTML.match(pattern)[0]
When I use this script as ordinary JavaScript on an HTML document to test it, it works perfectly, but when I install it as a userscript in Chrome, it doesn't find the pattern.
I can't figure out how to get around this problem.
Dump innerHTML into the console. If it looks fine, start building the regexp from more generic patterns (/\d+/) to more specific ones and output everything to the console. There are many different quote characters in different encodings, and many different types of dashes.
[\d\s:-]* is not a very good choice because it would also match " 1" and " ". I would rather write something as specific as possible:
/" \d{4}-\d{2}-\d{2} \d{2}:\d{2}"/
(Also, document.querySelector('div.the_selected_tag') would return null on your sample, because that selector looks for a class while your div has an id; you probably meant '#the_selected_tag'.)
It's much more likely that tag.innerHTML doesn't contain what you think it contains.
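The stricter pattern suggested above can be tried on a sample string outside the page. In this sketch the sample text is modelled on the markup in the question, extractDate is a hypothetical name, and the capture group is an addition so that only the date comes back, without the quotes and the leading space.

```javascript
// Stricter pattern from the answer, with a capture group around the date
var strict = /" (\d{4}-\d{2}-\d{2} \d{2}:\d{2})"/;

// Hypothetical helper: returns the date text, or null when absent.
function extractDate(html) {
  var m = html.match(strict);
  return m ? m[1] : null;
}

// Made-up sample standing in for tag.innerHTML
var sampleHtml = 'link\n" 2011-02-18 23:02"\nthing';
console.log(extractDate(sampleHtml)); // 2011-02-18 23:02
console.log(extractDate('" 1"'));     // null
```

If the real innerHTML uses a different quote character or dash, this returns null, which is a quick way to confirm the answer's suspicion.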
