deObfuscating in Python using transformed JS function - javascript

I needed to convert the following function to python to deobfuscate a text extracted while web scraping:
function obfuscateText(coded, key) {
    // Email obfuscator script 2.1 by Tim Williams, University of Arizona
    // Random encryption key feature by Andrew Moulden, Site Engineering Ltd
    // This code is freeware provided these four comment lines remain intact
    // A wizard to generate this code is at http://www.jottings.com/obfuscator/
    shift = coded.length
    link = ""
    for (i = 0; i < coded.length; i++) {
        if (key.indexOf(coded.charAt(i)) == -1) {
            ltr = coded.charAt(i)
            link += (ltr)
        }
        else {
            ltr = (key.indexOf(coded.charAt(i)) - shift + key.length) % key.length
            link += (key.charAt(ltr))
        }
    }
    document.write("<a href='mailto:" + link + "'>" + link + "</a>")
}
here is my converted python equivalent:
def obfuscateText(coded,key):
    shift = len(coded)
    link = ""
    for i in range(0,len(coded)):
        inkey=key.index(coded[i]) if coded[i] in key else None
        if ( not inkey):
            ltr = coded[i]
            link += ltr
        else:
            ltr = (key.index(coded[i]) - shift + len(key)) % len(key)
            link += key[ltr]
    return link
print obfuscateText("uw#287u##Guw#287Xw8Iwu!#W7L#", "WXYVZabUcdTefgShiRjklQmnoPpqOrstNuvMwxyLz01K23J456I789H.#G!#$F%&E'*+D-/=C?^B_`{A|}~")
actionattraction$comcastWnet
But the output is slightly wrong: instead of actionattraction#comcast.net I get the string above. The same code also often produces random characters for the same HTML page.
The target HTML page calls a JS obfuscateText function with the coded and key arguments. I extract that call into obsfunc and execute it on the fly:
email=eval(obsfunc)
This stores the email in the variable above. It works most of the time but fails occasionally. I strongly suspect the problem is with the arguments supplied to the Python function: they contain special characters, so they may need escaping or conversion. I tried passing raw strings and conversions such as repr(), but the problem persisted.
Some examples for actionattraction#comcast.net, wrongly and correctly computed by the same Python function (the first line of each pair is the resulting email):
#ation#ttr#ationVaoma#st!nct
obfuscateText("KMd%Y#Kdd8KMd%Y#IMY!MKcdJ#*d", "utvsrwqxpyonzm0l1k2ji3h4g5fe6d7c8b9aZ.Y#X!WV#U$T%S&RQ'P*O+NM-L/K=J?IH^G_F`ED{C|B}A~")
}ction}ttr}ction#comc}st.net
obfuscateText("}ARGML}RRP}ARGMLjAMKA}QRiLCR", "}|{`_^?=/-+*'&%$#!#.9876543210zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA~")
actionattraction#comcast.net
obfuscateText("DEWLRQDWWUDEWLRQoERPEDVWnQHW", "%&$#!#.'9876*54321+0zyxw-vutsr/qponm=lkjih?gfed^cbaZY_XWVUT`SRQPO{NMLKJ|IHGFE}DCBA~")

I've rewritten the deobfuscator:
def deobfuscate_text(coded, key):
    offset = (len(key) - len(coded)) % len(key)
    shifted_key = key[offset:] + key[:offset]
    lookup = dict(zip(key, shifted_key))
    return "".join(lookup.get(ch, ch) for ch in coded)
and tested it as
tests = [
("KMd%Y#Kdd8KMd%Y#IMY!MKcdJ#*d", "utvsrwqxpyonzm0l1k2ji3h4g5fe6d7c8b9aZ.Y#X!WV#U$T%S&RQ'P*O+NM-L/K=J?IH^G_F`ED{C|B}A~"),
("}ARGML}RRP}ARGMLjAMKA}QRiLCR", "}|{`_^?=/-+*'&%$#!#.9876543210zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA~"),
("DEWLRQDWWUDEWLRQoERPEDVWnQHW", "%&$#!#.'9876*54321+0zyxw-vutsr/qponm=lkjih?gfed^cbaZY_XWVUT`SRQPO{NMLKJ|IHGFE}DCBA~"),
("ZUhq4uh#e4Om.04O", "ksYSozqUyFOx9uKvQa2P4lEBhMRGC8g6jZXiDwV5eJcAp7rIHL31bnTWmN0dft")
]
for coded,key in tests:
    print(deobfuscate_text(coded, key))
which gives
actionattraction#comcast.net
actionattraction#comcast.net
actionattraction#comcast.net
anybody#home.com
Note that all three key strings contain &amp;; replacing it with & fixes the problem. Presumably the JavaScript was HTML-escaped by mistake at some point. Python has a module that decodes HTML entities, like so:
# Python 2.x:
import HTMLParser
html_parser = HTMLParser.HTMLParser()
unescaped = html_parser.unescape(my_string)
# Python 3.4+ (HTMLParser.unescape was removed in Python 3.9):
import html
unescaped = html.unescape(my_string)

First of all, index doesn't return None when the character is missing; it raises a ValueError. In your case, W appears instead of a dot because the index returned is 0: "not inkey" is true for 0 as well as for None, so the code mistakenly believes that the character is not present in the key.
Second, the presence of &amp; suggests that you may indeed have to find and decode HTML entities.
Finally, I'd recommend to rewrite it like
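def obfuscateText(code, key):
    len0 = len(code)
    len1 = len(key)
    link = ''
    for ch in code:
        try:
            # Shift the character back by len(code) positions within the key
            ch = key[(key.index(ch) - len0 + len1) % len1]
        except ValueError:
            pass  # character is not in the key: keep it unchanged
        link += ch
    return link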

Related

Minifying javascript via unicode

On dwitter.net I often see dweets that are encoded in an interesting way to reduce the JS character count.
For example https://www.dwitter.net/d/22372 (or https://www.dwitter.net/d/11506)
eval(unescape(escape`𮀮𩡯𫡴🐧𜡥𫐠𨐧𛁸𛡦𪑬𫁔𩑸𭀨𙱜𭐲𝠲𜀠𙰬𜰬𜠵𚐊𭀿𜀺𩀽𮀮𩱥𭁉𫑡𩱥𡁡𭁡𚀰𛀰𛁶🐳𝠬𭠩𛡤𨑴𨐊𩡯𬠨𨰮𭱩𩁴𪁼👷👩🐹𜰶𞱩𛐭𞰩𩐽𪐥𭠪𝠬𩁛𪐪𝀫𜱝🠵𜁼𯁸𛡦𪑬𫁒𩑣𭀨𦀽𩐫𩐯𜠪𤰨𭀭𪐯𭰩𚱷𛁩𛰳𛑥𚡃𚁴𛑘𛰹𞐩𚱥𚰵𜀬𞐬𪐼𜐿𭰺𞐩`.replace(/u../g,'')))
I understand how to decode this and read the JavaScript; that part is pretty trivial:
unescape(escape`𮀮𩡯𫡴🐧𜡥𫐠𨐧𛁸𛡦𪑬𫁔𩑸𭀨𙱜𭐲𝠲𜀠𙰬𜰬𜠵𚐊𭀿𜀺𩀽𮀮𩱥𭁉𫑡𩱥𡁡𭁡𚀰𛀰𛁶🐳𝠬𭠩𛡤𨑴𨐊𩡯𬠨𨰮𭱩𩁴𪁼👷👩🐹𜰶𞱩𛐭𞰩𩐽𪐥𭠪𝠬𩁛𪐪𝀫𜱝🠵𜁼𯁸𛡦𪑬𫁒𩑣𭀨𦀽𩐫𩐯𜠪𤰨𭀭𪐯𭰩𚱷𛁩𛰳𛑥𚡃𚁴𛑘𛰹𞐩𚱥𚰵𜀬𞐬𪐼𜐿𭰺𞐩`.replace(/u../g,''))
returns
x.font='2em a',x.fillText('\u2620 ',3,25)
t?0:d=x.getImageData(0,0,v=36,v).data
for(c.width|=w=i=936;i--;)e=i%v*6,d[i*4+3]>50||x.fillRect(X=e+e/2*S(t-i/w)+w,i/3-e*C(t-X/99)+e+50,9,i<1?w:9)
But what I don't understand is how to encode JS like this.
I noticed there is an intermediate step in this process. Running:
escape`𮀮𩡯𫡴🐧𜡥𫐠𨐧𛁸𛡦𪑬𫁔𩑸𭀨𙱜𭐲𝠲𜀠𙰬𜰬𜠵𚐊𭀿𜀺𩀽𮀮𩱥𭁉𫑡𩱥𡁡𭁡𚀰𛀰𛁶🐳𝠬𭠩𛡤𨑴𨐊𩡯𬠨𨰮𭱩𩁴𪁼👷👩🐹𜰶𞱩𛐭𞰩𩐽𪐥𭠪𝠬𩁛𪐪𝀫𜱝🠵𜁼𯁸𛡦𪑬𫁒𩑣𭀨𦀽𩐫𩐯𜠪𤰨𭀭𪐯𭰩𚱷𛁩𛰳𛑥𚡃𚁴𛑘𛰹𞐩𚱥𚰵𜀬𞐬𪐼𜐿𭰺𞐩`
returns
%uD878%uDC2E%uD866%uDC6F%uD86E%uDC74%uD83D%uDC27%uD832%uDC65%uD86D%uDC20%uD861%uDC27%uD82C%uDC78%uD82E%uDC66%uD869%uDC6C%uD86C%uDC54%uD865%uDC78%uD874%uDC28%uD827%uDC5C%uD875%uDC32%uD836%uDC32%uD830%uDC20%uD827%uDC2C%uD833%uDC2C%uD832%uDC35%uD829%uDC0A%uD874%uDC3F%uD830%uDC3A%uD864%uDC3D%uD878%uDC2E%uD867%uDC65%uD874%uDC49%uD86D%uDC61%uD867%uDC65%uD844%uDC61%uD874%uDC61%uD828%uDC30%uD82C%uDC30%uD82C%uDC76%uD83D%uDC33%uD836%uDC2C%uD876%uDC29%uD82E%uDC64%uD861%uDC74%uD861%uDC0A%uD866%uDC6F%uD872%uDC28%uD863%uDC2E%uD877%uDC69%uD864%uDC74%uD868%uDC7C%uD83D%uDC77%uD83D%uDC69%uD83D%uDC39%uD833%uDC36%uD83B%uDC69%uD82D%uDC2D%uD83B%uDC29%uD865%uDC3D%uD869%uDC25%uD876%uDC2A%uD836%uDC2C%uD864%uDC5B%uD869%uDC2A%uD834%uDC2B%uD833%uDC5D%uD83E%uDC35%uD830%uDC7C%uD87C%uDC78%uD82E%uDC66%uD869%uDC6C%uD86C%uDC52%uD865%uDC63%uD874%uDC28%uD858%uDC3D%uD865%uDC2B%uD865%uDC2F%uD832%uDC2A%uD853%uDC28%uD874%uDC2D%uD869%uDC2F%uD877%uDC29%uD82B%uDC77%uD82C%uDC69%uD82F%uDC33%uD82D%uDC65%uD82A%uDC43%uD828%uDC74%uD82D%uDC58%uD82F%uDC39%uD839%uDC29%uD82B%uDC65%uD82B%uDC35%uD830%uDC2C%uD839%uDC2C%uD869%uDC3C%uD831%uDC3F%uD877%uDC3A%uD839%uDC29
which then gets regex-replaced with .replace(/u../g,''). Producing this string from minified JavaScript is the part I can't work out.
Simply running encodeURIComponent() or escape() gets part of the way there, but not all the way.
So how do I convert my JavaScript source into a string made of %uD... sequences, each carrying a character code?
I am also on dwitter.
The code compressor actually began with a dweet (https://www.dwitter.net/d/23092).
It was made so people could fit more code into their demos, going right up to 194 chars instead of being limited to 140.
Note that this does not reduce the byte size: the number of characters goes down, but the size in bytes stays the same.
There is also an uncompressor at https://www.dwitter.net/d/14246
The simplified code for this is a simple unpack function:
function unpack(strange_blocky_code) {
    // Locate the packed part; anything before it is kept as-is
    const index = strange_blocky_code.toLowerCase().search(/eval\(unescape\(escape`/)
    if (index >= 0) {
        const start = strange_blocky_code.slice(0, index)
        const end = strange_blocky_code.slice(index)
        // Drop the leading "eval" so the decoded source is returned, not executed
        const result = eval(end.slice(4))
        if (result) return start + result // returns readable (but trivial) code
    }
}
The simplified compressing code is:
function compress(readable_code) {
    const value = [...readable_code.trim()]
    let code = ''
    for (let character of value) {
        const char = character.charCodeAt(0)
        if (char > 255) character = escape(character).replace(/%u/g, "\\u")
        code += character
    }
    const compressed =
        String.fromCharCode(...[...code.length % 2 ? code + ";" : code]
            .map((item, index) =>
                item.charCodeAt() | (index % 2 ? 0xDF00 : 0xDB00)
            )
        )
    return `eval(unescape(escape\`${compressed}\`.replace(/u../g,'')))`
}
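To see why the trick works on a small input, here is a minimal round-trip sketch. The names pack and unpackPairs are hypothetical, and the sketch assumes every character code is below 0x100. It uses the masks 0xD800 and 0xDC00, which match the %uD878%uDC2E output shown in the question; the 0xDB00 and 0xDF00 masks in compress above should work just as well, because .replace(/u../g,'') strips the "u" and the two hex digits after it either way.

```javascript
// Hypothetical helper names; assumes every character code is below 0x100.
function pack(source) {
  // Pad to an even length so the characters can be paired up.
  const padded = source.length % 2 ? source + ";" : source;
  let packed = "";
  for (let i = 0; i < padded.length; i += 2) {
    const first = padded.charCodeAt(i);
    const second = padded.charCodeAt(i + 1);
    if (first > 0xff || second > 0xff) {
      throw new RangeError("only 8-bit characters can be packed this way");
    }
    // The high surrogate carries the first character, the low surrogate the second.
    packed += String.fromCharCode(0xd800 | first, 0xdc00 | second);
  }
  return packed;
}

function unpackPairs(packed) {
  // escape() yields "%uD8xx%uDCyy"; dropping "u" plus two hex digits leaves "%xx%yy".
  return unescape(escape(packed).replace(/u../g, ""));
}

const source = "x.font='2em a'";
const packed = pack(source);
console.log([...packed].length);   // number of code points: half of source.length
console.log(unpackPairs(packed));  // the original source again
```

Each surrogate pair counts as one character when counted by code point, which is where the saving in character count comes from.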
If you're looking for editors, these are two that I like to use:
https://greyhope.uk/Dweet-Runner/index.html made by GreyHope
https://dweetabase.3d2k.com/ made by Frank Force
I hope this helps at all.

Converting time to hexadecimal then to string using JavaScript

I am trying to convert the present time to hexadecimal and then to a regular string variable.
For some reason I can only produce output wrapped in double quotes, such as "result", or an object. I am using id attributes to identify each div, and each div contains a different message. The ids look like id="somename-hexnumber". The id is sent from the browser to a Node.js server and split on "-": the first part is the person's name, and the hexadecimal part is just the div number, so the div is easy to find and delete if needed. The code I have so far is small, but I am out of ideas.
var thisRandom = Date.now();
const encodedString = thisRandom.toString(16);
var encoded = JSON.stringify(encodedString);
var tIDs = json.name+'-'+encoded;
var output = $('<div class="container" id="'+tIDs+'" onclick="DelComment(this.id, urank)"><span class="block"><div class="block-text"><p><strong><'+json.name+'></strong> '+json.data+'</p></div></div>');
When a hexadecimal number is produced I want the output to be something like 16FE67A334 and not "16FE67A334" or an object.
Do you want this?
Demo: https://codepen.io/gmkhussain/pen/QWEdOBW
The code below converts the time/number value x to hexadecimal.
var thisRandom = Date.now();
function timeToHexFunc(x) {
    if ( x < 0) {
        x = 0xFFFFFFFF + x + 1;
    }
    return x.toString(16).toUpperCase();
}
console.log(timeToHexFunc(thisRandom));
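The extra quotes in the question most likely come from JSON.stringify: toString(16) already returns a plain string, and stringifying a string wraps it in double quotes. A minimal sketch of building and splitting the id without that step follows; makeId and parseId are hypothetical names, and the "name-hex" format is taken from the question.

```javascript
// Hypothetical helpers; the "name-hex" id format is taken from the question.
function makeId(name, timestamp) {
  // toString(16) already returns a plain string, so no JSON.stringify is needed.
  return name + "-" + timestamp.toString(16).toUpperCase();
}

function parseId(id) {
  // Split on the last "-" so that names containing a dash still work.
  const cut = id.lastIndexOf("-");
  return { name: id.slice(0, cut), timestamp: parseInt(id.slice(cut + 1), 16) };
}

const id = makeId("somename", Date.now());
console.log(id);
console.log(JSON.stringify("FF")); // prints "FF" with the quotes included
console.log(parseId(id));
```

Splitting on the last dash is a design choice for this sketch: the hex part never contains a dash, while a user name might.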

Running script function from editor doesn't work as expected

So, I never ever programmed JavaScript and never did anything with Google Script before either. I have a fairly good understanding of Visual Basic and macros in Excel and Word. Trying to make a fairly basic program: Plow through a list of variables in a spreadsheet, make a new sheet for each value, insert a formula in this new sheet, cell (1,1).
Debug accepts my program, no issues - however, nothing at all is happening when I run the program:
function kraft() {
var rightHere =
SpreadsheetApp.getActiveSpreadsheet().getActiveSheet().getRange("A1:A131");
var loopy;
var goshDarn = "";
for (loopy = 1; loopy < 132; loopy++) {
celly = rightHere.getCell(loopy,1);
vaerdi = celly.getValue();
fed = celly.getTextStyle();
console.log(vaerdi & " - " & fed);
if (vaerdi != "" && fed.isBold == false) {
SpreadsheetApp.getActiveSpreadsheet().insertSheet(vaerdi);
var thisOne = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(vaerdi);
thisOne.deleteRows(500,500);
thisOne.deleteColumns(5, 23);
thisOne.getRange(1,1).setFormula("=ArrayFormula(FILTER('Individuelle varer'!A16:D30015,'Individuelle varer'!A16:A30015=" & Char(34) & vaerdi & Char(34) & ")))");
}
}
}
activeSheet could be called by name, so could activeSpreadsheet, I guess. But range A1:A131 has a ton of variables - some times there are empty lines and new headers (new headers are bold). But basically I want around 120 new sheets to appear in my spreadsheet, named like the lines here. But nothing happens. I tried to throw in a log thingy, but I cannot read those values anywhere.
I must be missing the most total basic thing of how to get script connected to a spreadsheet, I assume...
EDIT: I have tried to update code according to tips from here and other places, and it still does a wonderful nothing, but now looks like this:
function kraft() {
var rightHere = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet().getRange("A1:A131");
var loopy;
var goshDarn = "";
for (loopy = 1; loopy < 132; loopy++) {
celly = rightHere.getCell(loopy,1);
vaerdi = celly.getValue();
fed = celly.getFontWeight();
console.log(vaerdi & " - " & fed);
if (vaerdi != "" && fed.isBold == false) {
SpreadsheetApp.getActiveSpreadsheet().insertSheet(vaerdi);
var thisOne = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(vaerdi);
thisOne.deleteRows(500,500);
thisOne.deleteColumns(5, 23);
thisOne.getRange(1,1).setFormula("=ArrayFormula(FILTER('Individuelle varer'!A16:D30015,'Individuelle varer'!A16:A30015=" + "\"" + vaerdi + "\"" + ")))");
}
}
}
EDIT2: Thanks to exactly the advice I needed, the problem is now solved, with this code:
function kraft() {
var rightHere = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet().getRange("A1:A131");
var loopy;
for (loopy = 1; loopy < 132; loopy++) {
celly = rightHere.getCell(loopy,1);
vaerdi = celly.getValue();
fed = celly.getFontWeight()
console.log(vaerdi & " - " & fed);
if (vaerdi != "" && fed != "bold") {
SpreadsheetApp.getActiveSpreadsheet().insertSheet(vaerdi);
var thisOne = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(vaerdi);
thisOne.deleteRows(500,499);
thisOne.deleteColumns(5, 20);
thisOne.getRange(1,1).setFormula("=ArrayFormula(FILTER('Individuelle varer'!A16:D30015;'Individuelle varer'!A16:A30015=" + "\"" + vaerdi + "\"" + "))");
}
}
}
There are multiple issues with your script, but the main one is that you never actually call the isBold() function in your 'if' statement.
if (value && format.isBold() == false) {
//do something
}
Because you omitted the parentheses, 'fed.isBold' refers to the function itself rather than to its result. A function is never equal to 'false', so 'fed.isBold == false' always evaluates to 'false' and the body of the 'if' never runs.
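The difference can be checked in plain JavaScript with a stand-in object. The style object below is hypothetical and only mimics the isBold() method of the object returned by getTextStyle().

```javascript
// Hypothetical stand-in for the object returned by getTextStyle().
const style = { isBold: function () { return false; } };

console.log(typeof style.isBold);      // "function"
console.log(style.isBold == false);    // false: a function compared with a boolean
console.log(style.isBold() == false);  // true: the call result compared with a boolean
```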
There are other issues that prevent the script from running properly:
Not using the 'var' keyword to declare variables and polluting the global scope. As a result, all variables you declare within your 'for' loop are not private to your function. Instead, they are attached to the global object and are accessible outside the function. https://prntscr.com/kjd8s5
Not using the built-in debugger. Running the function is not debugging. You should set the breakpoints and click the debug button to execute your function step-by-step and examine all values as it's being executed.
Deleting non-existent columns. When you create the new sheet, you call deleteColumns(). There are 26 columns in total. The 1st parameter is the starting column, while the 2nd one specifies how many columns must be deleted. Starting from column 5 and removing 23 columns would reach column 27, which does not exist, so the call throws an exception. Always refer to the documentation to avoid such errors.
console.log doesn't exist within the context of the Script Editor. You are NOT executing the scripts inside your browser, so Browser object model is not available. Use Logger.log(). Again, this is detailed in the documentation.
Your formula is not formatted properly.
JS is a dynamically typed language that's not easy to get used to. If you don't do at least some research prior to writing code, you'll be in for a lot of pain.
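The column arithmetic from the list above can be sketched as a small guard. clampDeleteCount is a hypothetical helper, not part of the Apps Script API; in a real script the total would presumably come from the sheet itself, for example via getMaxColumns().

```javascript
// Hypothetical helper: clamp a delete request to the columns that actually exist.
function clampDeleteCount(totalColumns, startColumn, requested) {
  // Columns available from startColumn through the last column, inclusive.
  const available = totalColumns - startColumn + 1;
  return Math.max(0, Math.min(requested, available));
}

console.log(clampDeleteCount(26, 5, 23)); // 22: only columns 5 through 26 exist
console.log(clampDeleteCount(26, 5, 20)); // 20: the request already fits
```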

Unable to Get Output From While Loop in Javascript

I'm working on my final project of the Winter 2017 quarter to demonstrate how to use Regular Expressions in both C# and JavaScript code behind pages. I've got the C# version of my demonstration program done, but the JavaScript version is making me pull what little hair I have left on my head out (no small achievement since I got a fresh buzz cut this morning!). The problem involves not getting any output after applying a Regular Expression in a While loop to get each instance of the expression and printing it out.
On my HTML page I have an input textarea, seven radio buttons, an output textarea, and two buttons underneath (one button is to move the output text to the input area to perform multiple iterations of applying expressions, and the other button to clear all textareas for starting from scratch). Each radio button links to a function that applies a regular expression to the text in the input area. Five of my seven functions work; the sixth is the one I can't figure out, and the seventh is essentially the same but with a slightly different RegEx pattern, so if I fix the sixth function, the seventh function will be a snap.
(I tried to insert/upload a JPG of the front end, but the photo upload doesn't seem to be working. Hopefully you get the drift of what I've set up.)
Here are my problem children from my JS code behind:
// RegEx_Demo_JS.js - code behind for RegEx_Demo_JS
var inputString; // Global variable for the input from the input text box.
var pattern; // Global variable for the regular expression.
var result; // Global variable for the result of applying the regular expression to the user input.
// Initializes a new instance of the StringBuilder class
// and appends the given value if supplied
function StringBuilder()
{
var strings = [];
this.append = function (string)
{
string = verify(string);
if (string.length > 0) strings[strings.length] = string;
}
this.appendLine = function (string)
{
string = verify(string);
if (this.isEmpty())
{
if (string.length > 0) strings[strings.length] = string;
else return;
}
else strings[strings.length] = string.length > 0 ? "\r\n" + string : "\r\n";
}
this.clear = function () { strings = []; };
this.isEmpty = function () { return strings.length == 0; };
this.toString = function () { return strings.join(""); };
var verify = function (string)
{
if (!defined(string)) return "";
if (getType(string) != getType(new String())) return String(string);
return string;
}
var defined = function (el)
{
// Changed per Ryan O'Hara's comment:
return el != null && typeof(el) != "undefined";
}
var getType = function (instance)
{
if (!defined(instance.constructor)) throw Error("Unexpected object type");
var type = String(instance.constructor).match(/function\s+(\w+)/);
return defined(type) ? type[1] : "undefined";
}
}
Within the code of the second radio button (which will be the seventh and last function to complete), I tested the StringBuilder with data in a local variable, and it ran successfully and produced output in the output textarea. But I get no output from this next function, which uses a while loop:
function RegEx_Match_TheOnly_AllInstances()
{
inputString = document.getElementById("txtUserInput").value;
pattern = /(\s+the\s+)/ig; // Using an Flag (/i) to select either lowercase or uppercase version. Finds first occurrence either as a standalone word or inside a word.
//result = pattern.exec(inputString); // Finds the first index location
var arrResult; // Array for the results of the search.
var sb = getStringBuilder(); // Variable to hold iterations of the result and the text
while ((arrResult = pattern.exec(inputString)) !==null)
{
sb.appendLine = "Match: " + arrResult[0] ;
}
document.getElementById("txtRegExOutput").value = sb.toString();
/* Original code from C# version:
// string pattern = #"\s+(?i)the\s+"; // Same as above, but using Option construct for case insensitive search.
string pattern = #"(^|\s+)(?i)the(\W|\s+)";
MatchCollection matches = Regex.Matches(userTextInput, pattern);
StringBuilder outputString = new StringBuilder();
foreach (Match match in matches)
{
string outputRegExs = "Match: " + "\"" + match.Value + "\"" + " at index [" + match.Index + ","
+ (match.Index + match.Length) + "]" + "\n";
outputString.Append(outputRegExs);
}
txtRegExOutput.Text = outputString.ToString();
*/
} // End RegEx_Match_The_AllInstances
I left the commented code in to show what I had used in the C# code behind version to illustrate what I'm trying to accomplish.
The test input/string I used for this function is:
Don’t go there. If you want to be the Man, you have to beat The Man.
That should return two hits. Ideally, I want it to show the word that it found and the index where it found the word, but at this point I'd be happy to just get some output showing every instance it found, and then build on that with the index and possibly the lastIndex.
So, is my problem in my While loop, the way I'm applying the StringBuilder, or a combination of the two? I know the StringBuilder code works, at least when not being used in a loop and using some test data from the site I found that code. And the code for simply finding the first instance of "the" as a standalone or inside another word does work and returns output, but that doesn't use a loop.
I've looked through Stack Overflow and several other JavaScript websites for inspiration, but nothing I've tried so far has worked. I appreciate any help anyone can provide! (If you need me to post any other code, please advise and I'll be happy to oblige.)
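Two things stand out in the function above: sb.appendLine = "Match: " + ... assigns a string to the method instead of calling it, and getStringBuilder() is not defined in the code shown. A minimal sketch of the exec loop follows, assuming a plain array in place of the StringBuilder. findAllMatches is a hypothetical name, and the output format mirrors the commented C# version.

```javascript
// Hypothetical helper; collects one line per match, as in the C# version.
function findAllMatches(text) {
  // The g flag makes exec() advance through the text via pattern.lastIndex.
  const pattern = /\s+the\s+/gi;
  const lines = [];
  let match;
  while ((match = pattern.exec(text)) !== null) {
    const end = match.index + match[0].length;
    lines.push('Match: "' + match[0] + '" at index [' + match.index + "," + end + "]");
  }
  return lines.join("\r\n");
}

const sample = "Don’t go there. If you want to be the Man, you have to beat The Man.";
console.log(findAllMatches(sample));
```

With the StringBuilder from the question, the equivalent inside the loop would be a method call such as sb.appendLine("Match: " + arrResult[0]) on an instance created with new StringBuilder().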

regex jquery to replace number with link

I have a page of database results where users occasionally type in a reference to another post. (The database is a day event tracker for a scheduling office.)
The reference to the other post is always the post's ID (format 001234). We use these to match events with dockets and other paperwork from truck drivers. It is always a 6-digit number on its own.
<div class="eventsWrapper">
Data from DB is output here using PHP in a foreach loop.
Presents data in similar fashion to facebook for example.
</div>
What I need to do is once the data in the above DIV is loaded, then go through and replace every whole 6 digit number (not part of a number) with the number as a hyperlink.
It is important it only looks for number with a space either side:
EG: Ref 001122 <- like this - not like this -> ignore AB001122
Once I have the hyperlink tag I can make the reference number clickable to take users directly to that post.
I am not that good with regex but think it is something like:
\b(![0-9])?\d{6}\b
I have no idea how to search the DIV and then replace that regex with the hyperlink. Appreciate any help.
(?:^| )\d{6}(?= |$)
You can use this and replace each match with <space><whateveryouwant>. See the demo:
https://regex101.com/r/bW3aR1/7
\b won't work on its own, because a word boundary also occurs next to punctuation, whereas you want a space (or the start or end of the text) on either side.
Something like this? Make an array of the individual posts and loop through. If there is only ever one ID in a post, you can do without the second loop.
var str = ['Ref 001122 <- like this - not like this -> ignore AB001122', 'Ref 001123 <- like this - not like this -> ignore AB001122', 'Ref 001124 <- like this - not like this -> ignore AB001122'];
var regex = /\b\d{6}\b/g;
for (var j = 0; j < str.length; j++) {
    var newString = str[j];
    var urls = str[j].match(regex) || []; // match() returns null when nothing is found
    for (var i = 0; i < urls.length; i++) {
        var url = urls[i];
        // Replace in newString so that earlier replacements are kept
        newString = newString.replace(url, '<a href="' + url + '">' + url + '</a>');
    }
    $('#output').append(newString);
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<div id="output"></div>
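A single replace call with a callback can do the same job without the nested loops. The sketch below keeps the requirement of whitespace (or the start or end of the text) on either side of the number. linkifyIds and the "/post/" URL prefix are assumptions for illustration, not part of the original page.

```javascript
// Hypothetical helper; the "/post/" URL prefix is an assumption.
function linkifyIds(text, baseUrl) {
  // Group 1 keeps the leading whitespace (or start of text); the lookahead
  // leaves the trailing whitespace in place for the next match.
  return text.replace(/(^|\s)(\d{6})(?=\s|$)/g, function (whole, lead, id) {
    return lead + '<a href="' + baseUrl + id + '">' + id + "</a>";
  });
}

const sampleText = "Ref 001122 <- like this - not like this -> ignore AB001122";
console.log(linkifyIds(sampleText, "/post/"));

// In the page this might be applied with jQuery, for example (untested sketch):
// $(".eventsWrapper").html(function (i, html) { return linkifyIds(html, "/post/"); });
```

Note that rewriting the HTML of the whole wrapper would also touch six-digit numbers inside tags or attributes if they happen to be surrounded by whitespace, so applying the function to text content only is safer.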
