Mathematica.SE is currently in private beta and will open to the public in a few days. Stack Overflow and related sites use prettify.js; however, Mathematica is not a supported language. It would be pretty awesome to have a custom highlighting script for our site, and I request the JavaScript and CSS community's help in developing such a script and the accompanying CSS.
I've listed below a few basic requirements such that it captures most of the features of Mathematica's default highlighting scheme (ignoring stuff that only the internal parser would know). I've also named the colours generically – hexadecimal colour codes can be picked from the screenshots I've provided (further below). I've also added code samples to accompany the screenshots so that folks can test it out.
Basic requirements
1. Comments
These are entered as (* comment *). Anything between these delimiters should be highlighted in gray.
2. Strings
These are entered as "string" (single quotes are not supported) and should be highlighted in pink.
3. Operators/shorthand notations
Apart from the standard +, -, *, /, ^, ==, etc., Mathematica has several other operators and shorthand notations. The most commonly encountered ones are:
@, @@, @@@, /@, //@, //, ~, /., //., ->, :>, /:, /;, :=, :^=, =.,
&, |, ||, &&, _, __, ___, ;;, [[, ]], <<, >>, ~~, <>
These, along with parentheses, brackets and braces, should all be highlighted in black.
4. Pattern objects and slots
Pattern objects start with a letter and have either _ or __ or ___ attached, for example x_, x__ and x___. These can also have additional letters following the underscore, as x_abc, etc. All of these should be highlighted in green.
Slots are # and ## and can also be followed by an integer as #1, ##4, etc., and should also be in green.
Both of these (pattern objects and slots) are usually terminated by an operator/bracket/shortform from point 3 above.
5. Functions/variables
"Functions and variables" is rather loose terminology here, but it serves the purposes of this post. Anything not falling in the above 4 can be highlighted in black. Mathematica often uses backticks ` in code, and these should be considered part of the function/variable name. E.g., abcd`defg. Dollar signs $ anywhere in a variable name are to be treated just like a letter (i.e., nothing special).
For all of the above, if they appear inside strings, they should be treated as such, i.e. "#~#" should be highlighted in pink.
Additional nice to haves:
In the pattern objects in point 4 above, if the underscore(s) is followed by a ? and then some letters, then the part following the _ should be in black. E.g., in x__?abc, the x__ part must be in green and the ?abc in black.
If a function/variable starts with a capital letter, then it is highlighted in black. If it starts with a small letter, it is highlighted in blue. Internally, this differentiates built-in functions vs. user-defined functions. However, the Mathematica community (pretty much everywhere) sticks to this naming convention fairly well, so distinguishing the two would serve some purpose.
Screenshots & code samples:
1. Simple examples
Here's a small example set, with a screenshot at the end showing how it looks in Mathematica:
(*simple pattern objects & operators*)
f[x_, y__] := x Times @@ y
(*pattern objects with chars at the end and strings*)
f[x_String] := x <> "hello#world"
(*pattern objects with ?xxx at the end*)
f[x_?MatrixQ] := x + Transpose@x
<< Combinatorica` (*example with backticks and inline comment*)
(*Slightly more complicated example with a mix of stuff*)
Developer`PartitionMap[Total, Range@1000, 3][[3 ;; -3]]~Partition~2 //
 Times @@@ # &
2. A real world example
Here's an example from this answer of mine that also indicates my point 2 in the "Additional nice to haves" section, i.e., lowercase stuff being highlighted in blue.
Also, you might notice some of the variables highlighted in orange – I purposefully didn't include that as a requirement, as I think that's going to be a lot harder to do without a parser that knows Mathematica.
prob = MapIndexed[#1/#2 &,
   Accumulate[
    EuclideanDistance[{0, 0}, #] < 1 & /@ arrows // Boole]]~N~4;
Manipulate[
 Graphics[{White, Rectangle[{-5, -5}, {5, 5}], Red, Disk[{0, 0}, 1],
   Black, Point[arrows[[;; i]]],
   Text[Style[First@prob[[i]], Bold, 18, "Helvetica"], {-4.5, 4.5}]},
  ImageSize -> 200], {i, Range[2, 20000, 1]},
 ControlType -> Manipulator, SaveDefinitions -> True]
Is this feasible? Too much? Too hard? Impossible?
Quite frankly, I don't know the answer to any of those. I just listed some basic features that everyone on mathematica.SE would love to have and some additional stuff that would be a cherry on the top. However, do let me know if these are too difficult to implement. We can work out a smaller subset of features.
In recognition of this help, you all have the Mathematica community's eternal gratitude and in addition, I'll award a 500 bounty to each person that contributes significantly to this (if it's done in parts by different folks) – I'll rely on your votes/comments/output on the answers to decide what's significant (perhaps more than one bounty to one person if they do all the work). Implementing the "Additional nice to haves" gets an automatic +500 regardless of previous bounties, so you can also build upon the work of others even if you don't do the first half. I might also periodically place smaller bounties to attract users who might not have seen this question, so if you happen to earn those bounties, they'll be in addition to the "bounty to reward an existing answer" which will be decided towards the end.
Lastly, I'm not in a hurry. So please take your time with this question. The bounty is always an option until it is implemented by SE (or until it has been determined that existing answers satisfy the requirements completely). Ideally, I'm hoping to get this implemented 2/3rds of our way into the beta, which is 2 months from now.
Preface
Since the Mathematica support for google-code-prettify was mainly developed for the new Mathematica.Stackexchange site, please see also the discussion here.
Introduction
I have no deep knowledge of all of this, but some time ago I wrote a cweb plugin for Idea to have my code highlighted there. In an IDE, highlighting is not a one-step process. It is divided into several steps, and each step adds more highlighting abilities. Let me explain this a bit, so that I can later give some reasons why some things are (imho) not possible for the kind of code highlighter we need here.
At first the code is split into tokens which are the single parts of a programming language. After this lexer you can categorize intervals of your code into e.g. whitespace, literal, string, comment, and so on. This lexer eats the source-code by testing regular expressions, storing the token-type for a text-span and stepping forward in the code.
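To make the lexer step a bit more concrete, here is a minimal sketch of such a loop in JavaScript. The rule table, the token type names and the tokenize function are my own assumptions for illustration, not code taken from prettify.js: each rule is tried at the current position, the first match decides the token type, and the scanner steps forward by the length of the match.

```javascript
// Hypothetical rule table: each rule pairs a token type with a regex
// anchored at the start of the remaining input (names are illustrative).
const rules = [
  ["whitespace", /^\s+/],
  ["comment", /^\(\*[\s\S]*?\*\)/],
  ["string", /^"(?:[^"\\]|\\[\s\S])*"/],
  ["literal", /^[a-zA-Z$][a-zA-Z0-9$]*/],
];

function tokenize(source) {
  const tokens = [];
  let pos = 0;
  while (pos < source.length) {
    const rest = source.slice(pos);
    let type = "other";
    let text = rest[0]; // fallback: consume a single character
    for (const [ruleType, regex] of rules) {
      const match = regex.exec(rest);
      if (match) {
        type = ruleType;
        text = match[0];
        break;
      }
    }
    // Store the token type for this text span and step forward.
    tokens.push({ type, text, start: pos, end: pos + text.length });
    pos += text.length;
  }
  return tokens;
}

console.log(tokenize('f[x] (* note *) "hi"'));
```

The order of the rules matters: an earlier rule wins over a later one, which is why comments and strings come before plain literals here.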
After this lexical scan the source-code can be parsed by using the rules of the programming language, the tokens and the underlying code. For instance, if we have a token Plus which is of type Keyword then we know that the brackets and the parameter should follow. If not, the syntax is not correct. What you can build with this parsing is called an AST, abstract syntax tree, and looks basically like the TreeForm of Mathematica syntax.
With a nicely designed language, like Java for instance, it is possible to check the code while typing and make it almost impossible to write syntactically wrong code.
prettify.js and Mathematica Code
First, prettify.js implements only a lexical scanner, but no parser. I'm pretty sure that a parser would be impossible anyway, given the time constraints for displaying a web page. So let me explain which features are not possible/feasible with prettify.js:
Also, you might notice some of the variables highlighted in orange – I
purposefully didn't include that as a requirement, as I think that's
going to be a lot harder to do without a parser that knows
Mathematica.
Right, because the highlighting of these variables depends on the context. You have to know that you are inside a Table construct or something like that.
Hacking prettify.js
I think hacking an extension for prettify.js is not so hard. I'm an absolute regular expression noob, so be prepared for what follows.
We don't need so much stuff for a simple Mathematica lexer.
We have whitespace, comments, string-literals, braces, a lot of operators, usual literals like variables and a giant list of keywords.
Let's start with the keywords in JavaScript regexp form:
Export["google-code-prettify/keywordsmma.txt",
 StringJoin @@ Riffle[Apply[StringJoin,
    Partition[Riffle[Names[RegularExpression["[A-Z].*"]],
      "|"], 100], {1}], "'+ \n '"], "TEXT"]
The regular expression for whitespace and string-literals can be copied from another language. Comments are matched by something like
/^\(\*[\s\S]*?\*\)/
This goes wrong if we have comments inside comments, but for the moment I don't care. We have braces and brackets
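For what it's worth, nested comments cannot be handled by a single non-greedy regex, but a small depth counter does the job. The following is only a sketch under my own assumptions: scanNestedComment is a hypothetical helper, and as far as I know prettify.js language handlers are regex-based, so this would not drop into prettify as it is.

```javascript
// Returns the index just past the comment that starts at `start`,
// or -1 if no comment starts there or it is never closed.
function scanNestedComment(source, start) {
  if (!source.startsWith("(*", start)) return -1;
  let depth = 0;
  let i = start;
  while (i < source.length) {
    if (source.startsWith("(*", i)) {
      depth += 1; // entering a (possibly nested) comment
      i += 2;
    } else if (source.startsWith("*)", i)) {
      depth -= 1; // leaving one nesting level
      i += 2;
      if (depth === 0) return i;
    } else {
      i += 1;
    }
  }
  return -1;
}

const code = "(* outer (* inner *) still outer *) x";
const end = scanNestedComment(code, 0);
console.log(code.slice(0, end));                  // the whole nested comment
console.log(/^\(\*[\s\S]*?\*\)/.exec(code)[0]);   // stops at the first "*)"
```

The second log line shows where the regex gives up: it ends the comment at the first closing delimiter, so "still outer *)" would be lexed as code.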
/^(?:\[|\]|{|}|\(|\))/
We have something like blub_boing which should be matched separately.
/^[a-zA-Z$]+[a-zA-Z0-9$]*_+([a-zA-Z$]+[a-zA-Z0-9$]*)*/
We have the slots #, ##, #1, ##9 (currently only one digit can follow)
/^#+[0-9]?/
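If slots with more than one digit should be matched as a whole, replacing the ? by * would be one way to do it. This is only a sketch of a possible variant (slotAnyDigits is my own name for it), compared against the rule above:

```javascript
const slotOneDigit = /^#+[0-9]?/;  // the rule from the text
const slotAnyDigits = /^#+[0-9]*/; // assumed variant: any number of digits

for (const sample of ["#", "##", "#1", "##9", "#12"]) {
  console.log(sample, slotOneDigit.exec(sample)[0], slotAnyDigits.exec(sample)[0]);
}
// For "#12" the first rule matches only "#1", the variant matches "#12".
```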
We have variable names and other literals. They need to start with either a letter or $ and then can follow letters, numbers and $. Currently \[Gamma] is not matched as one literal but for the moment it's ok.
/^[a-zA-Z$]+[a-zA-Z0-9$]*/
And we have operators (I'm not sure this list is complete).
/^(?:\+|\-|\*|\/|,|;|\.|:|@|~|=|\>|\<|&|\||_|`|\^)/
Update
I cleaned things up a bit, did some debugging and created a color style which looks beautiful to me. As far as I can see, the following works correctly:
All system symbols which can be found through Names[RegularExpression["[A-Z].*"]] are matched and highlighted in blue
Braces and brackets are black but use a bold font-weight. This was a suggestion from Szabolcs and I like it very much, since it definitely adds some energy to the appearance of the code
Patterns, as they appear in function definitions, and the slots of pure functions are highlighted in green. This was suggested by Yoda and matches the highlighter in the Mathematica frontend. Patterns are only green in combination with a variable, as in blub__Integer, a1_ or b34_Integer32. With test functions for the pattern, as in num_?NumericQ, only the part in front of the question mark is green.
Comments and Strings have the same color. Comments and strings can go over several lines. Strings can include backslashed quotes. Comments cannot be nested.
For the coloring I used consistently the ColorData[1] scheme to ensure colors look nice side by side.
Currently it looks like that:
Testing and debugging
Szabolcs asked whether and how it is possible to test this. This is easy: You need my google-code-prettify source (Where can I put this, so that everyone has access?). Unpack the sources and open the file tests/mathematica_test.html in a web browser. This file loads the files src/prettify.js, src/lang-mma.js and src/prettify-mma-1.css by itself.
In lang-mma.js you find the regular expressions the lexer uses when splitting the code into tokens.
In prettify-mma-1.css you find the style definitions I use.
To test your own code, simply open mathematica_test.html in an editor and paste your stuff between the pre tags. Reload the page and your code should appear.
Debugging: If the highlighter is not working correctly, you can debug with an IDE or with Google Chrome. In Chrome, select the word where the highlighter starts to fail, right-click and choose Inspect Element. What you see then is the underlying HTML highlighting code. There you can see every single token and which type the token is. This looks like
<span class="tag">[</span>
You see the open bracket is of type tag. This matches with the regexp definition I made in lang-mma.js. In Chrome it is even possible to browse the JS code, set breakpoints and debug it while reloading your page.
Local installation for Google Chrome and Firefox
Tim Stone was so kind to write a script which injects the highlighter during the loading of sites under http://stackoverflow.com/questions/. As soon as google-code-prettify is turned on for mathematica.stackexchange.com it should work there too.
I adapted this script to use my lexical scanning rules and colors. I heard that in Firefox the script is not always working, but this is how to install it:
Chrome: Follow this link https://github.com/halirutan/Mathematica-Source-Highlighting/raw/master/mathematica-source-highlighter.user.js and you should be prompted whether you want to install this extension.
Firefox: ensure you have the Greasemonkey plugin installed. Then download the same link as for Chrome.
Now you are set up and when you reload this page, comments, kernel-functions, strings and patterns should be highlighted correctly.
Versions
Under https://github.com/halirutan/Mathematica-Source-Highlighting/raw/master/mathematica-source-highlighter.user.js you will always find the most recent version. Here is some change history.
- 02/23/2013 Updated the lists of symbols and keywords to Mathematica version 9.0.1
- 09/02/2012 Some minor issues with the coloring of Mathematica patterns were fixed. For a detailed overview of the features involving the Pattern operator :, see also the discussion here
- 02/02/2012 Support of many number input formats like .123`10.2 or 1.2`100.3*^-12, highlighting of In[23] and Out[4], ::usage or other messages like blub::boing, highlighting of patterns like ProblemTest[prob:(findp_[pfun_, pvars_, {popts___}, ___]), opts___], bug fixes (I checked the parser against 3500 lines of package code from the AddOns directory. It took about 3-4 sec to run, which should be more than fast enough for our purposes.)
- 01/30/2012 Fixed missing '?' in the operator list. Included named characters like \[Gamma] to give a complete match for such symbols. Added $variables to the keyword list. Improved the matching of patterns. Added matching of context constructions like Developer`PackedArrayQ. Switched the color scheme due to many requests. Now it's like in the Mathematica frontend: keywords black, variables blue.
- 01/29/2012 Tim hacked the injection code. Now the highlighting works on mathematica.stackexchange too.
- 01/25/2012 Added the recognition of Mathematica numbers. This should now highlight things like {1, 1.0, 1., .12, 16^^1.34f, ...}. Additionally it should recognize the backtick behind a number. I switched comments and strings to gray and use a dark red for the numbers.
- 01/23/2012 Initial version. Capabilities are described under section Update.
Not exactly what you are asking for, but I created a similar extension for MATLAB (based on the excellent work already done here). The project is hosted on github.
The script should solve some of the issues common for MATLAB code on Stack Overflow:
comments (no need to use tricks like %# ..)
transpose operator (single quote) is correctly recognized as such (confused with quoted strings by the default prettifier)
highlighting of popular built-in functions
Keep in mind the syntax highlighting is not perfect; among other things, it fails on nested block comments (I can live with that for now). As always, comments/fixes/issues are welcome.
A separate userscript is included, it allows switching the language used as seen in the screenshot below:
--- before ---
--- after ---
For those interested, a third userscript is provided, adapted to work on "MATLAB Answers" website.
TL;DR
Install the userscript for SO directly from:
https://github.com/amroamroamro/prettify-matlab/raw/master/js/prettify-matlab.user.js
Related
I have a large corpus of ES6 code. A search returns all occurrences of a string or regular expression, but I'd like an option that will either show or hide hits within comments. Atom can already parse the language enough to discern code vs comments, so it shouldn't be much of a stretch to make search results sensitive to code, comments or both.
Does anyone know if a plugin to enable this behavior already exists? I can't seem to find it if it does, and if I get the urge to write it, I don't want to just be spinning wheels.
I am parsing a series of strings with various formats. The last edge case encountered has me stumped. I'm not a great regexer; believe me, it was a challenge to get to this point.
Here are critical snippets from the strings I'm trying to parse. The second example is the current edge case I'm stuck on.
LBP824NW2-58.07789x43.0-207C72
LBP824WW1-77.6875 in. x 3.00 in. 24VDC
I am trying to grab all of the digits (including the decimal) that make up the width part of the dimension in the string (this would be the first number in the dimension). What works in every other case has been grabbing all digits from the "-" to the "x" using the following expression:
/-(\d+\.?\d+?)x\B/
However, this does not handle the cases that have inches included in the dimension. I thought about using "look-aheads" or "look-behinds", but I got confused. Any suggestions would be appreciated.
RegEx can be told to look for "zero or one" of things, using (...)? syntax, so if your pattern already works but it gets confused by a new pattern that simply has "more string data embedded in what is otherwise the same pattern" you can add in zero-or-one checks and you should be good to go.
In this case, putting something like (\s*in\.?\s*)? in a few tactical places to either match "any number of spaces (including none) followed by in followed by an optional full stop followed by any number of spaces (including none)" or nothing should work.
That said, "I cannot change the formatting" is almost never an argument, because while you can't change the formatting, you can almost always change what parses it. RegEx might be adequate, but some code that checks what kind of general pattern it is, and then calls the appropriate function for tokenizing and inspecting that specific string pattern, should be quite possible. Unless you've been hired to literally update some predefined CLI script that has a grep in it and you're not allowed to touch anything except for the pattern...
This is the working solution using regex: -(\d+\.?\d+?)(\s*in\.?\s*|x)
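Here is a quick check of that pattern against the two sample strings from the question. The extractWidth helper is a hypothetical name of mine, just a sketch to show which part ends up in the first capture group:

```javascript
// Pattern quoted from the answer above; group 1 is the width.
const widthPattern = /-(\d+\.?\d+?)(\s*in\.?\s*|x)/;

// Hypothetical helper: returns the width as a string, or null if absent.
function extractWidth(partNumber) {
  const match = widthPattern.exec(partNumber);
  return match ? match[1] : null;
}

console.log(extractWidth("LBP824NW2-58.07789x43.0-207C72"));         // "58.07789"
console.log(extractWidth("LBP824WW1-77.6875 in. x 3.00 in. 24VDC")); // "77.6875"
```

One thing to keep in mind: because of the \d+ followed by \d+?, the group needs at least two digits, so a single-digit width such as "-5x3" would not match and the helper would return null.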
I have been validating my JavaScript using JSLint for about 2 years now and once in a while there are rules that change. In general when JSLint introduces a new rule, there is a checkbox to ignore this rule when parsing, or if you choose to not ignore it then to make your code compliant to it.
As I was running my JSLint validation today, however, I ran into these two new errors:
Use spaces, not tabs.
This is not the "mixing of tabs and spaces" error. I am using only tabs. This is a recently modified version of "mixing of tabs and spaces" which now disallows tabs in general.
And:
Unsafe character.
*/
Unsafe character.
_const: {
There are no new options to ignore. I cannot understand what is unsafe about closing a block comment, why it considers _const: { unsafe when I have nomen: true (dangling _ in identifiers), or why I should suddenly be switching from tabs to spaces when I still have the indentation configured as 4 spaces per tab.
Does anyone have an idea why these were introduced, or at least how to make JSLint ignore these new rules?
Update:
The Messy White Space option works around the issue but it would cause other unexpected behavior:
if (condition) {
// ^-- there is a space but it won't indicate an error
Well it looks like Douglas Crockford just made a whole lot more people switch to JSHint. Have a look at this commit.
The "Mixed spaces and tabs" error has been removed, and a new "Use spaces, not tabs" error has been added in its place. Aside from that, there's one tiny change in that diff that shows the cause of this. The following line (comment added):
at = source_row.search(/ \t/);
//                      ^ Space
has been replaced with this:
at = source_row.search(/\t/);
//                      ^ No space!
Following that search there's an if statement. If the condition evaluates to true, the "Use spaces, not tabs" warning is issued. Here's that statement:
if (at >= 0) {
    warn_at('use_spaces', line, at + 1);
}
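To see what that one-character change does in practice, here is a small sketch that runs both regular expressions over a few made-up source rows (the sample rows and variable names are my own, not taken from JSLint):

```javascript
const tabIndented = "\tvar x = 1;";   // tabs only
const mixed = "  \tvar x = 1;";       // spaces followed by a tab
const spacesOnly = "    var x = 1;";  // spaces only

const oldCheck = / \t/; // a space followed by a tab (mixed indentation)
const newCheck = /\t/;  // any tab at all

for (const row of [tabIndented, mixed, spacesOnly]) {
  console.log(JSON.stringify(row), row.search(oldCheck), row.search(newCheck));
}
// tabs only:   old -1, new 0  -> only the new check is >= 0
// mixed:       old 1,  new 2  -> both checks are >= 0
// spaces only: old -1, new -1 -> neither check fires
```

So a file indented purely with tabs passed the old search but now gets a non-negative index on every indented line, which is what triggers the warning.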
I hope that this is just a little oversight by Crockford. As you can see, JSLint is now going to raise this warning if you use a tab character anywhere. Unfortunately, his commit messages are completely useless, and the documentation doesn't appear to have been updated, so I can't do anything other than speculate as to the reasons behind this change.
I suggest you abandon JSLint and switch to JSHint right now.
You can suppress the error by clicking the "messy white space" option.
To answer why JSLint now gives an error about tabs, http://www.jslint.com/help.html gives this justification:
Tabs and spaces should not be mixed. We should pick just one in order
to avoid the problems that come from having both. Personal preference
is an extremely unreliable criteria. Neither offers a powerful
advantage over the other. Fifty years ago, tab had the advantage of
consuming less memory, but Moore's Law has eliminated that advantage.
Space has one clear advantage over tab: there is no reliable standard
for how many spaces a tab represents, but it is universally accepted
that a space occupies a space. So use spaces. You can edit with tabs
if you must, but make sure it is spaces again before you commit. Maybe
someday we will finally get a universal standard for tabs, but until
that day comes, the better choice is spaces.
Essentially he wants everybody to come to a consensus on whether to use tabs or spaces to prevent them from ever being mixed. He's decided that the consistency of the width of a space makes it the superior choice, so we should all use that. Clearly some people will disagree with this line of thinking (myself included), but that is the reason why JSLint throws that error.
Depending on your editor/IDE you can adjust how TAB works.
For instance, I use Sublime Text.
Near the bottom right corner there is a Tab Size: 4 indicator.
I clicked on it and selected 'Indent Using Spaces'.
This updated all my tabs to use spaces and the JSLint errors disappeared. I try to use as few options as possible with JSLint, as I want my code to be well structured.
I also use JSFormat, which will indent based on my editor's settings, so whenever I'm done I run JSFormat, then JSLint. No errors = happy boy!
Hope it helps.
How do I enable automatic folding in Vim? set foldmethod=syntax doesn't seem to do much of anything.
To allow folds based on syntax add something like the following to your .vimrc:
set foldmethod=syntax
set foldlevelstart=1
let javaScript_fold=1 " JavaScript
let perl_fold=1 " Perl
let php_folding=1 " PHP
let r_syntax_folding=1 " R
let ruby_fold=1 " Ruby
let sh_fold_enabled=1 " sh
let vimsyn_folding='af' " Vim script
let xml_syntax_folding=1 " XML
Syntax based folding is defined in the syntax files of the language which are located in $VIM/syntax or /usr/share/vim/vimXX/syntax/. But some languages do not have folding rules built into their syntax files; for example Python. For those languages you need to download something from http://vim.sf.net that does folds. Otherwise you will need to use folds based on indents. To do this effectively you will likely want to add the following to your .vimrc file:
set foldmethod=indent
set foldnestmax=2
Other kinds of folding
There are 6 types of folds:
manual   manually define folds
indent   more indent means a higher fold level
expr     specify an expression to define folds
syntax   folds defined by syntax highlighting
diff     folds for unchanged text
marker   folds defined by markers in the text
Personally, I only use syntax folds. Usually, I just want to fold the method and not fold every indent level. Inconsistent indenting and weirdly formatted legacy code at work often makes indent folding difficult or impossible. Adding marks to the document is tedious and people who do not use Vim will not maintain them when they edit the document. Manual folds work great until someone edits your code in source control and all your folds are now in the wrong place.
More reading
See :help fold-methods to learn the details of different fold methods.
See :help folding to learn the keyboard commands for manipulating folds.
See :help folds for help on the entire topic of folding.
JavaScript folding didn't work for me either. I found out when I did set syntax=javaScript (with a capital S), it suddenly worked.
I tried all of the solutions here and none worked with NeoVim v0.3.1, until I found the vim-javascript plugin; then folding started to work.
The way to enable folding in new versions of Vim has changed (I'm using vim 7.4). Now you should create the file ~/.vim/ftplugin/javascript.vim (on linux) and add your code folding instructions as explained in Eric Johnson's answer. File type detection and loading plugins for specific file types must be enabled by putting this into your .vimrc:
filetype plugin on
I have started the painful first steps of using emacs to edit an HTML file with both HTML tags and javascript content. I have installed nxhtml and tried using it - i.e set up to use nxhtml-mumamo-mode for .html files. But I am not loving it. When I am editing the Javascript portion of the code the tab indents do not behave as they do when editing C/C++ code. It starts putting tabs within the line and if you try and hit tab in the white space preceding a line it inserts the tab rather than re-tabifying the line.
Another aspect that I don't like is that it doesn't do syntax colouring like the usual C/C++ modes do. I much prefer the behaviour of the default java-mode when editing HTML files but that doesn't play nicely with the HTML code. :-(
1) Is there a better mode for editing HTML files with Javascript portions?
2) Is there a way to get nxhtml to use the default java-mode for the javascript portions?
Regards,
M
Another solution is multi-web-mode:
https://github.com/fgallina/multi-web-mode
which may be more easily configurable than the already mentioned multi-mode.
You just configure your preferred modes in your .emacs file like this:
(require 'multi-web-mode)
(setq mweb-default-major-mode 'html-mode)
(setq mweb-tags
      '((php-mode "<\\?php\\|<\\? \\|<\\?=" "\\?>")
        (js-mode "<script[^>]*>" "</script>")
        (css-mode "<style[^>]*>" "</style>")))
(setq mweb-filename-extensions '("php" "htm" "html" "ctp" "phtml" "php4" "php5"))
(multi-web-global-mode 1)
More on Emacs's multiple multiple modes (sigh) here:
http://www.emacswiki.org/emacs/MultipleModes
UPDATE: simplified the regexps to detect JavaScript or CSS areas to make them work with HTML5 -- no need for super-precise and fragile regular expressions.
I have written the major mode web-mode.el for this kind of usage : editing HTML templates that embed JS, CSS, Java (JSP), PHP. You can download it on http://web-mode.org
Web-mode.el does syntax highlighting and indentation according to the type of the block.
Installation is simple :
(require 'web-mode)
(add-to-list 'auto-mode-alist '("\\.html$" . web-mode))
Great question. Look how many upvotes you got on your first one!
Everyone has the same experience as you. Me too.
Rather than rely on nxhtml-mode, which exhibited the same sort of strangeness for me as you described, I looked for another option and found multi-mode.el. It's a sort of general-purpose multi-mode skeleton. To use it, you need to specify regular expressions to describe where one mode starts and another one ends. So, you look for <script...> to start a javascript block, and <style...> to start a css block. Then you plug in your own modes for each block - if you like espresso for javascript, use it. And so on for the other regexes that identify other blocks.
In practice, as you navigate through the document, a different mode is enabled for each block.
I used multi-mode to produce an ASP.NET mode that allows me to edit C#, HTML, CSS, and Javascript in a single file, with the proper highlighting and fontification depending on where the cursor is in the buffer. It isn't perfect but I found it to be a marked improvement on the existing possibilities. In fact this may be what you want. Try it out.
https://code.google.com/p/csharpmode/source/browse/trunk/aspx-mode.el?r=14
Not really a good solution, but a quick fix if you really need to have javascript in your html is to select the region containing the javascript, use the command narrow-to-region (C-x n n), and then switch to your preferred javascript mode. The command widen (C-x n w) brings you back.
It sounds like you've set up your nxhtml incorrectly. The only setup necessary should be loading the autostart.el file, and then everything should work to some level. nxhtml isn't perfect in any way, but my experience using it for html/css/javascript/mako has been pretty good, at least for everything but mako. But I'm pretty sure I've screwed up the mako part.
This is how I initialize my nxhtml:
(when (load "autostart.el" t)
  (setq nxhtml-skip-welcome t
        mumamo-chunk-coloring 'submode-colored
        indent-region-mode t
        rng-nxml-auto-validate-flag nil))