UglifyJS

JavaScript parser / mangler / compressor / beautifier library for NodeJS

#+TITLE: UglifyJS -- a JavaScript parser/compressor/beautifier
#+KEYWORDS: javascript, js, parser, compiler, compressor, mangle, minify, minifier
#+DESCRIPTION: a JavaScript parser/compressor/beautifier in JavaScript
#+STYLE:
#+AUTHOR: Mihai Bazon
#+EMAIL: mihai.bazon@gmail.com

* NEW: UglifyJS2

I started working on UglifyJS's successor, version 2. It's almost a full
rewrite (except for the parser which is heavily modified, everything else
starts from scratch). I've detailed my reasons in the README, see the
project page.

[[https://github.com/mishoo/UglifyJS2][https://github.com/mishoo/UglifyJS2]]

Version 1 will continue to be maintained for fixing show-stopper bugs, but
no new features should be expected.

* UglifyJS --- a JavaScript parser/compressor/beautifier

This package implements a general-purpose JavaScript
parser/compressor/beautifier toolkit. It is developed on [[http://nodejs.org/][NodeJS]], but it
should work on any JavaScript platform supporting the CommonJS module system
(and if your platform of choice doesn't support CommonJS, you can easily
implement it, or discard the =exports.*= lines from UglifyJS sources).

The tokenizer/parser generates an abstract syntax tree from JS code. You
can then traverse the AST to learn more about the code, or do various
manipulations on it. This part is implemented in [[../lib/parse-js.js][parse-js.js]] and it's a
port to JavaScript of the excellent [[http://marijn.haverbeke.nl/parse-js/][parse-js]] Common Lisp library from [[http://marijn.haverbeke.nl/][Marijn
Haverbeke]].

( See [[http://github.com/mishoo/cl-uglify-js][cl-uglify-js]] if you're looking for the Common Lisp version of
UglifyJS. )

The second part of this package, implemented in [[../lib/process.js][process.js]], inspects and
manipulates the AST generated by the parser to provide the following:

  • ability to re-generate JavaScript code from the AST. Optionally
    indented---you can use this if you want to “beautify” a program that has
    been compressed, so that you can inspect the source. But you can also run
    our code generator to print out an AST without any whitespace, so you
    achieve compression as well.

  • shorten variable names (usually to single characters). Our mangler will
    analyze the code and generate proper variable names, depending on scope
    and usage, and is smart enough to deal with globals defined elsewhere, or
    with =eval()= calls or =with{}= statements. In short, if =eval()= or
    =with{}= are used in some scope, then all variables in that scope and any
    variables in the parent scopes will remain unmangled, and any references
    to such variables remain unmangled as well.

  • various small optimizations that may lead to faster code but certainly
    lead to smaller code. Where possible, we do the following:

    • foo["bar"] ==> foo.bar

    • remove block brackets ={}=

    • join consecutive var declarations:
      var a = 10; var b = 20; ==> var a=10,b=20;

    • resolve simple constant expressions: 1 +2 * 3 ==> 7. We only do the
      replacement if the result occupies fewer bytes; for example 1/3 would
      translate to 0.333333333333, so in this case we don't replace it.

    • consecutive statements in blocks are merged into a sequence; in many
      cases, this leaves blocks with a single statement, so then we can remove
      the block brackets.

    • various optimizations for IF statements:

      • if (foo) bar(); else baz(); ==> foo?bar():baz();
      • if (!foo) bar(); else baz(); ==> foo?baz():bar();
      • if (foo) bar(); ==> foo&&bar();
      • if (!foo) bar(); ==> foo||bar();
      • if (foo) return bar(); else return baz(); ==> return foo?bar():baz();
      • if (foo) return bar(); else something(); ==> {if(foo)return bar();something()}

    • remove some unreachable code and warn about it (code that follows a
      =return=, =throw=, =break= or =continue= statement, except
      function/variable declarations).

    • act as a limited version of a pre-processor (cf. the pre-processor of
      C/C++) to allow you to safely replace selected global symbols with
      specified values. When combined with the optimisations above this can
      make UglifyJS operate slightly more like a compilation process, in
      that when certain symbols are replaced by constant values, entire code
      blocks may be optimised away as unreachable.
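The IF rewrites in the list above depend only on ordinary JavaScript semantics, so they can be checked by hand. The following is a minimal sketch, not UglifyJS code: the helpers =bar=, =baz= and =trace= are made up for illustration. It runs each statement form next to its compressed form and compares which functions were called.

```javascript
// Hypothetical helpers (not part of UglifyJS) that record which branch ran.
var calls = [];
function bar() { calls.push("bar"); }
function baz() { calls.push("baz"); }

// Run fn with the given condition and report the calls it made.
function trace(fn, foo) { calls = []; fn(foo); return calls.join(","); }

var pairs = [
  // if (foo) bar(); else baz();   ==>  foo?bar():baz();
  [function (foo) { if (foo) bar(); else baz(); },
   function (foo) { foo ? bar() : baz(); }],
  // if (!foo) bar(); else baz();  ==>  foo?baz():bar();
  [function (foo) { if (!foo) bar(); else baz(); },
   function (foo) { foo ? baz() : bar(); }],
  // if (foo) bar();               ==>  foo&&bar();
  [function (foo) { if (foo) bar(); },
   function (foo) { foo && bar(); }],
  // if (!foo) bar();              ==>  foo||bar();
  [function (foo) { if (!foo) bar(); },
   function (foo) { foo || bar(); }]
];

// Compare both forms for a mix of truthy and falsy conditions.
var allEquivalent = pairs.every(function (pair) {
  return [true, false, 0, "", "x", null].every(function (foo) {
    return trace(pair[0], foo) === trace(pair[1], foo);
  });
});
console.log(allEquivalent); // true
```

The short forms are expressions rather than statements, which is what lets the compressor merge them into sequences with neighbouring statements.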

** Unsafe transformations

The following transformations can in theory break code, although they're
probably safe in most practical cases. To enable them you need to pass the
=--unsafe= flag.

*** Calls involving the global Array constructor

The following transformations occur:

#+BEGIN_SRC js
new Array(1, 2, 3, 4) => [1,2,3,4]
Array(a, b, c) => [a,b,c]
new Array(5) => Array(5)
new Array(a) => Array(a)
#+END_SRC

These are all safe if the Array name isn't redefined. JavaScript does allow
one to globally redefine Array (and pretty much everything, in fact) but I
personally don't see why anyone would do that.

UglifyJS does handle the case where Array is redefined locally, or even
globally but with a =function= or =var= declaration. Therefore, in the
following cases UglifyJS doesn't touch calls or instantiations of Array:

#+BEGIN_SRC js
// case 1. globally declared variable
var Array;
new Array(1, 2, 3);
Array(a, b);

// or (can be declared later)
new Array(1, 2, 3);
var Array;

// or (can be a function)
new Array(1, 2, 3);
function Array() { ... }

// case 2. declared in a function
(function(){
    a = new Array(1, 2, 3);
    b = Array(5, 6);
    var Array;
})();

// or
(function(Array){
    return Array(5, 6, 7);
})();

// or
(function(){
    return new Array(1, 2, 3, 4);
    function Array() { ... }
})();

// etc.
#+END_SRC
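To see why a single-argument =new Array(x)= is only shortened to =Array(x)= and never to =[x]=, and why a redefined =Array= has to be left alone, here is a small sketch in plain JavaScript. It does not use UglifyJS, and the shadowing function is a made-up example.

```javascript
// A single numeric argument is a length, not an element, so new Array(5)
// can be shortened to Array(5) but not to [5].
var viaConstructor = new Array(5);
var viaCall = Array(5);
var literal = [5];
console.log(viaConstructor.length, viaCall.length, literal.length); // 5 5 1

// With several arguments the constructor and the literal give the same result.
var sameElements =
  JSON.stringify(new Array(1, 2, 3)) === JSON.stringify([1, 2, 3]);

// If Array is shadowed, rewriting the call to a literal would change behaviour.
// The local function below is hypothetical and returns something else entirely.
var shadowed = (function () {
  return new Array(1, 2, 3);
  function Array() { return { custom: true }; }
})();
console.log(sameElements, shadowed.custom); // true true
```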

*** =obj.toString()= ==> =obj+""=
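The two forms agree for ordinary values, but they can differ when an object defines its own =valueOf=, because the =+= operator consults =valueOf= before =toString=. The object below is a contrived example written for illustration.

```javascript
// Hypothetical object whose valueOf and toString disagree.
var obj = {
  valueOf: function () { return 42; },
  toString: function () { return "forty-two"; }
};
var explicit = obj.toString();  // calls toString directly
var concatenated = obj + "";    // + tries valueOf first, then stringifies 42

// For plain values the two forms give the same string.
var n = 3.5;
var agree = n.toString() === n + "";

console.log(explicit, concatenated, agree); // forty-two 42 true
```

Another difference: =null.toString()= throws a =TypeError=, whereas =null + ""= evaluates to the string ="null"=.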

** Install (NPM)

UglifyJS is now available through NPM --- =npm install uglify-js@1= should
do the job.

NOTE: The NPM package has been upgraded to UglifyJS2. If you need to
install version 1.x you need to add =@1= to the command, as I did above. I
strongly suggest that you try to upgrade, though this might not be simple (v2
has a completely different AST structure and API).

** Install latest code from GitHub

#+BEGIN_SRC sh

## clone the repository
mkdir -p /where/you/wanna/put/it
cd /where/you/wanna/put/it
git clone https://github.com/mishoo/UglifyJS.git

## make the module available to Node
mkdir -p ~/.node_libraries/
cd ~/.node_libraries/
ln -s /where/you/wanna/put/it/UglifyJS/uglify-js.js

## and if you want the CLI script too:
mkdir -p ~/bin
cd ~/bin
ln -s /where/you/wanna/put/it/UglifyJS/bin/uglifyjs

# (then add ~/bin to your $PATH if it's not there already)

#+END_SRC

** Usage

There is a command-line tool that exposes the functionality of this library
for your shell-scripting needs:

#+BEGIN_SRC sh
uglifyjs [ options... ] [ filename ]
#+END_SRC

=filename= should be the last argument and should name the file from which
to read the JavaScript code. If you don't specify it, it will read code
from STDIN.

Supported options:

  • =-b= or =--beautify= --- output indented code; when passed, additional
    options control the beautifier:

    • =-i N= or =--indent N= --- indentation level (number of spaces)

    • =-q= or =--quote-keys= --- quote keys in literal objects (by default,
      only keys that cannot be identifier names will be quoted).

  • =-c= or =--consolidate-primitive-values= --- consolidates null, Boolean,
    and String values. Known as aliasing in the Closure Compiler. Worsens the
    data compression ratio of gzip.

  • =--ascii= --- pass this argument to encode non-ASCII characters as
    =\uXXXX= sequences. By default UglifyJS won't bother to do it and will
    output Unicode characters instead. (the output is always encoded in UTF8,
    but if you pass this option you'll only get ASCII).

  • =-nm= or =--no-mangle= --- don't mangle names.

  • =-nmf= or =--no-mangle-functions= -- in case you want to mangle variable
    names, but not touch function names.

  • =-ns= or =--no-squeeze= --- don't call =ast_squeeze()= (which does various
    optimizations that result in smaller, less readable code).

  • =-mt= or =--mangle-toplevel= --- mangle names in the toplevel scope too
    (by default we don't do this).

  • =--no-seqs= --- when =ast_squeeze()= is called (thus, unless you pass
    =--no-squeeze=) it will reduce consecutive statements in blocks into a
    sequence. For example, "a = 10; b = 20; foo();" will be written as
    "a=10,b=20,foo();". On various occasions, this allows us to discard the
    block brackets (since the block becomes a single statement). This is ON
    by default because it seems safe and saves a few hundred bytes on some
    libs that I tested it on, but pass =--no-seqs= to disable it.

  • =--no-dead-code= --- by default, UglifyJS will remove code that is
    obviously unreachable (code that follows a =return=, =throw=, =break= or
    =continue= statement and is not a function/variable declaration). Pass
    this option to disable this optimization.

  • =-nc= or =--no-copyright= --- by default, =uglifyjs= will keep the initial
    comment tokens in the generated code (assumed to be copyright information
    etc.). If you pass this it will discard it.

  • =-o filename= or =--output filename= --- put the result in =filename=. If
    this isn't given, the result goes to standard output (or see next one).

  • =--overwrite= --- if the code is read from a file (not from STDIN) and you
    pass =--overwrite= then the output will be written in the same file.

  • =--ast= --- pass this if you want to get the Abstract Syntax Tree instead
    of JavaScript as output. Useful for debugging or learning more about the
    internals.

  • =-v= or =--verbose= --- output some notes on STDERR (for now just how long
    each operation takes).

  • =-d SYMBOL[=VALUE]= or =--define SYMBOL[=VALUE]= --- will replace
    all instances of the specified symbol where used as an identifier
    (except where the symbol has been properly declared by a =var=
    declaration, used as a function parameter, or similar) with the
    specified value. This argument may be specified multiple times to
    define multiple symbols. If no value is specified, the symbol will be
    replaced with the value =true=; otherwise you can specify a numeric
    value (such as =1024=), a quoted string value (such as ="object"= or
    ='https://github.com'=), or the name of another symbol or keyword
    (such as =null= or =document=).
    This allows you, for example, to assign meaningful names to key
    constant values but discard the symbolic names in the uglified
    version for brevity/efficiency, or, when used with care, allows
    UglifyJS to operate as a form of conditional compilation
    whereby defining appropriate values may, by dint of the constant
    folding and dead code removal features above, remove entire
    superfluous code blocks (e.g. completely remove instrumentation or
    trace code for production use).
    Where string values are being defined, the handling of quotes is
    likely to be subject to the specifics of your command shell
    environment, so you may need to experiment with quoting styles
    depending on your platform, or you may find the option
    =--define-from-module= more suitable for use.

  • =--define-from-module SOMEMODULE= --- will load the named module (as
    per the NodeJS =require()= function) and iterate all the exported
    properties of the module defining them as symbol names to be defined
    (as if by the =--define= option) per the name of each property
    (i.e. without the module name prefix) and given the value of the
    property. This is a much easier way to handle and document groups of
    symbols to be defined rather than a large number of =--define=
    options.

  • =--unsafe= --- enable other additional optimizations that are known to be
    unsafe in some contrived situations, but could still be generally useful.
    For now only these:

    • foo.toString() ==> foo+""
    • new Array(x,...) ==> [x,...]
    • new Array(x) ==> Array(x)

  • =--max-line-len= (default 32K characters) --- add a newline after around
    32K characters. I've seen both FF and Chrome croak when all the code was
    on a single line of around 670K. Pass =--max-line-len 0= to disable this
    safety feature.

  • =--reserved-names= --- some libraries rely on certain names to be used, as
    pointed out in issues #92 and #81, so this option allows you to exclude such
    names from the mangler. For example, to keep names =require= and =$super=
    intact you'd specify =--reserved-names "require,$super"=.

  • =--inline-script= -- when you want to include the output literally in an
    HTML =<script>= tag you can use this option to prevent =</script= from
    showing up in the output.

  • =--lift-vars= -- when you pass this, UglifyJS will apply the following
    transformations (see the notes in API, =ast_lift_variables=):

    • put all =var= declarations at the start of the scope
    • make sure a variable is declared only once
    • discard unused function arguments
    • discard unused inner (named) functions
    • finally, try to merge assignments into that one =var= declaration, if
      possible.
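To make the effect of the =--ascii= and =--inline-script= options above more concrete, here is a rough sketch of the kind of rewriting they describe. Both helpers are hypothetical and written only for this illustration. They are not UglifyJS's implementation, which may differ in detail.

```javascript
// Hypothetical helper: encode non-ASCII characters as \uXXXX sequences,
// which is what --ascii is described as doing.
function toAsciiOnly(str) {
  return str.replace(/[\u0080-\uffff]/g, function (ch) {
    var hex = ch.charCodeAt(0).toString(16);
    return "\\u" + "0000".slice(hex.length) + hex; // left-pad to 4 hex digits
  });
}

// Hypothetical helper: keep "</script" from appearing literally in the
// output, as --inline-script is described as doing. Inside a JavaScript
// string literal "<\/script" still means "</script".
function escapeInlineScript(code) {
  return code.replace(/<\/script/gi, "<\\/script");
}

console.log(toAsciiOnly("caf\u00e9"));                    // caf\u00e9
console.log(escapeInlineScript('var s = "</script>";')); // var s = "<\/script>";
```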

*** API

To use the library from JavaScript, you'd do the following (example for
NodeJS):

#+BEGIN_SRC js
var jsp = require("uglify-js").parser;
var pro = require("uglify-js").uglify;

var orig_code = "... JS code here";
var ast = jsp.parse(orig_code); // parse code and get the initial AST
ast = pro.ast_mangle(ast); // get a new AST with mangled names
ast = pro.ast_squeeze(ast); // get an AST with compression optimizations
var final_code = pro.gen_code(ast); // compressed code here
#+END_SRC

The above performs the full compression that is possible right now. As you
can see, there is a sequence of steps which you can apply. For example if
you want compressed output but for some reason you don't want to mangle
variable names, you would simply skip the line that calls
=pro.ast_mangle(ast)=.
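The sequence of steps could be wrapped in a small helper that makes each optional step skippable. This is only a sketch: the function name =compress= and the option names =mangle= and =squeeze= are invented here, and =jsp= and =pro= are passed in as parameters so that the example can be run with stand-in objects when the library is not installed.

```javascript
// Sketch only. In real use `jsp` and `pro` would be
// require("uglify-js").parser and require("uglify-js").uglify.
// The option names `mangle` and `squeeze` are made up for this example.
function compress(jsp, pro, code, options) {
  options = options || {};
  var ast = jsp.parse(code);                                  // initial AST
  if (options.mangle !== false) ast = pro.ast_mangle(ast);    // optional step
  if (options.squeeze !== false) ast = pro.ast_squeeze(ast);  // optional step
  return pro.gen_code(ast);                                   // back to source
}

// Stand-in objects that only record which steps ran.
var steps = [];
var fakeParser = {
  parse: function (code) { steps.push("parse"); return {}; }
};
var fakeUglify = {
  ast_mangle: function (ast) { steps.push("ast_mangle"); return ast; },
  ast_squeeze: function (ast) { steps.push("ast_squeeze"); return ast; },
  gen_code: function (ast) { steps.push("gen_code"); return ""; }
};

compress(fakeParser, fakeUglify, "var a = 1;", { mangle: false });
console.log(steps.join(" -> ")); // parse -> ast_squeeze -> gen_code
```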

Some of these functions take optional arguments. Here's a description:

  • =jsp.parse(code, strict_semicolons)= -- parses JS code and returns an AST.
    =strict_semicolons= is optional and defaults to =false=. If you pass
    =true= then the parser will throw an error when it expects a semicolon and
    it doesn't find it. For most JS code you don't want that, but it's useful
    if you want to strictly sanitize your code.

  • =pro.ast_lift_variables(ast)= -- merge and move =var= declarations to the
    top of the scope; discard unused function arguments or variables; discard
    unused (named) inner functions. It also tries to merge assignments
    following the =var= declaration into it.

    If your code is very hand-optimized concerning =var= declarations,
    lifting variable declarations might actually increase size. For me it
    helps out. On jQuery it adds 865 bytes (243 after gzip). YMMV. Also
    note that (since it's not enabled by default) this operation isn't yet
    heavily tested (please report if you find issues!).

    Note that although it might increase the output size, it's technically
    more correct: in certain situations, dead code removal might drop
    variable declarations, which would not happen if the variables are
    lifted in advance.

    Here's an example of what it does:

#+BEGIN_SRC js
function f(a, b, c, d, e) {
    var q;
    var w;
    w = 10;
    q = 20;
    for (var i = 1; i < 10; ++i) {
        var boo = foo(a);
    }
    for (var i = 0; i < 1; ++i) {
        var boo = bar(c);
    }
    function foo(){ ... }
    function bar(){ ... }
    function baz(){ ... }
}

// transforms into ==>

function f(a, b, c) {
    var i, boo, w = 10, q = 20;
    for (i = 1; i < 10; ++i) {
        boo = foo(a);
    }
    for (i = 0; i < 1; ++i) {
        boo = bar(c);
    }
    function foo() { ... }
    function bar() { ... }
}
#+END_SRC

  • =pro.ast_mangle(ast, options)= -- generates a new AST containing mangled
    (compressed) variable and function names. It supports the following
    options:

    • =toplevel= -- mangle toplevel names (by default we don't touch them).
    • =except= -- an array of names to exclude from compression.
    • =defines= -- an object with properties named after symbols to
      replace (see the =--define= option for the script) and the values
      representing the AST replacement value. For example,
      ={ defines: { DEBUG: ['name', 'false'], VERSION: ['string', '1.0'] } }=

  • =pro.ast_squeeze(ast, options)= -- employs further optimizations designed
    to reduce the size of the code that =gen_code= would generate from the
    AST. Returns a new AST. =options= can be a hash; the supported options
    are:

    • =make_seqs= (default true) which will cause consecutive statements in a
      block to be merged using the "sequence" (comma) operator

    • =dead_code= (default true) which will remove unreachable code.

  • =pro.gen_code(ast, options)= -- generates JS code from the AST. By
    default it's minified, but using the =options= argument you can get nicely
    formatted output. =options= is, well, optional :-) and if you pass it it
    must be an object and supports the following properties (below you can see
    the default values):

    • =beautify: false= -- pass =true= if you want indented output
    • =indent_start: 0= (only applies when =beautify= is =true=) -- initial
      indentation in spaces
    • =indent_level: 4= (only applies when =beautify= is =true=) --
      indentation level, in spaces (pass an even number)
    • =quote_keys: false= -- if you pass =true= it will quote all keys in
      literal objects
    • =space_colon: false= (only applies when =beautify= is =true=) -- whether
      to put a space before the colon in object literals
    • =ascii_only: false= -- pass =true= if you want to encode non-ASCII
      characters as =\uXXXX=.
    • =inline_script: false= -- pass =true= to escape occurrences of
      =</script= in strings
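The =defines= option of =pro.ast_mangle= described above takes AST fragments rather than plain values, which is easy to get wrong by hand. Here is a sketch of a conversion helper. The function =toDefines= is hypothetical, the ="name"= and ="string"= node types are taken from the example above, and the ="num"= node type for numbers is an assumption about the AST format.

```javascript
// Hypothetical helper, not part of UglifyJS: turn plain JavaScript values
// into the AST fragments expected by the `defines` option.
function toDefines(values) {
  var defines = {};
  for (var key in values) {
    if (!Object.prototype.hasOwnProperty.call(values, key)) continue;
    var val = values[key];
    if (typeof val === "string") {
      defines[key] = ["string", val];
    } else if (typeof val === "number") {
      defines[key] = ["num", val];           // assumed node type for numbers
    } else if (val === true || val === false || val === null) {
      defines[key] = ["name", String(val)];  // e.g. ['name', 'false']
    } else {
      throw new Error("Unsupported value for " + key);
    }
  }
  return defines;
}

var mangleOptions = { defines: toDefines({ DEBUG: false, VERSION: "1.0" }) };
console.log(JSON.stringify(mangleOptions));
// {"defines":{"DEBUG":["name","false"],"VERSION":["string","1.0"]}}
```

The resulting object would then be passed as the second argument to =pro.ast_mangle(ast, mangleOptions)=.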

*** Beautifier shortcoming -- no more comments

The beautifier can be used as a general purpose indentation tool. It's
useful when you want to make a minified file readable. One limitation,
though, is that it discards all comments, so you don't really want to use it
to reformat your code, unless you don't have, or don't care about, comments.

In fact it's not the beautifier that discards comments --- they are dumped at
the parsing stage, when we build the initial AST. Comments don't really
make sense in the AST, and while we could add nodes for them, it would be
inconvenient because we'd have to add special rules to ignore them at all
the processing stages.

*** Use as a code pre-processor

The =--define= option can be used, particularly when combined with the
constant folding logic, as a form of pre-processor to enable or remove
particular constructions, such as might be used for instrumenting
development code, or to produce variations aimed at a specific
platform.

The code below illustrates the way this can be done, and how the
symbol replacement is performed.

#+BEGIN_SRC js
CLAUSE1: if (typeof DEVMODE === 'undefined') {
    DEVMODE = true;
}

CLAUSE2: function init() {
    if (DEVMODE) {
        console.log("init() called");
    }
    ....
    DEVMODE && console.log("init() complete");
}

CLAUSE3: function reportDeviceStatus(device) {
    var DEVMODE = device.mode, DEVNAME = device.name;
    if (DEVMODE === 'open') {
        ....
    }
}
#+END_SRC

When the above code is normally executed, the undeclared global
variable =DEVMODE= will be assigned the value true (see =CLAUSE1=)
and so the =init()= function (=CLAUSE2=) will write messages to the
console log when executed, but in =CLAUSE3= a locally declared
variable will mask access to the =DEVMODE= global symbol.

If the above code is processed by UglifyJS with an argument of
=--define DEVMODE=false= then UglifyJS will replace =DEVMODE= with the
boolean constant value false within =CLAUSE1= and =CLAUSE2=, but it
will leave =CLAUSE3= as it stands because there =DEVMODE= resolves to
a validly declared variable.

And more so, the constant-folding features of UglifyJS will recognise
that the =if= condition of =CLAUSE1= is thus always false, and so will
remove the test and body of =CLAUSE1= altogether (including the
otherwise slightly problematical statement =false = true;= which it
will have formed by replacing =DEVMODE= in the body). Similarly,
within =CLAUSE2= both calls to =console.log()= will be removed
altogether.

In this way you can mimic, to a limited degree, the functionality of
the C/C++ pre-processor to enable or completely remove blocks
depending on how certain symbols are defined - perhaps using UglifyJS
to generate different versions of source aimed at different
environments.

It is recommended (but not made mandatory) that symbols designed for
this purpose are given names consisting of =UPPER_CASE_LETTERS= to
distinguish them from other (normal) symbols and avoid the sort of
clash that =CLAUSE3= above illustrates.
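One way to keep to this naming convention is to collect the symbols in a single object, in the shape a module for =--define-from-module= would export, and check the names before a build. The sketch below is only an illustration: the object =buildDefines=, its contents, and the function =nonConventionalNames= are all made up.

```javascript
// Hypothetical set of build-time symbols, one property per symbol. In real
// use this object would be exported from a module, for example with
// module.exports = buildDefines.
var buildDefines = {
  DEVMODE: false,
  MAX_RETRIES: 3,
  apiHost: "example.org" // deliberately breaks the naming convention
};

// Return the symbol names that are not written in UPPER_CASE_LETTERS.
function nonConventionalNames(defs) {
  return Object.keys(defs).filter(function (name) {
    return !/^[A-Z][A-Z0-9_]*$/.test(name);
  });
}

console.log(nonConventionalNames(buildDefines).join(",")); // apiHost
```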

** Compression -- how good is it?

Here are updated statistics. (I also updated my Google Closure and YUI
installations).

We're still a lot better than YUI in terms of compression, though slightly
slower. We're still a lot faster than Closure, and compression after gzip
is comparable.

| File                        | UglifyJS         | UglifyJS+gzip | Closure          | Closure+gzip | YUI              | YUI+gzip |
|-----------------------------+------------------+---------------+------------------+--------------+------------------+----------|
| jquery-1.6.2.js             | 91001 (0:01.59)  |         31896 | 90678 (0:07.40)  |        31979 | 101527 (0:01.82) |    34646 |
| paper.js                    | 142023 (0:01.65) |         43334 | 134301 (0:07.42) |        42495 | 173383 (0:01.58) |    48785 |
| prototype.js                | 88544 (0:01.09)  |         26680 | 86955 (0:06.97)  |        26326 | 92130 (0:00.79)  |    28624 |
| thelib-full.js (DynarchLIB) | 251939 (0:02.55) |         72535 | 249911 (0:09.05) |        72696 | 258869 (0:01.94) |    76584 |

** Bugs?

Unfortunately, for the time being there is no automated test suite. But I
ran the compressor manually on non-trivial code, and then I tested that the
generated code works as expected. A few hundred times.

DynarchLIB was started in times when there was no good JS minifier.
Therefore I was quite religious about trying to write short code manually,
and as such DL contains a lot of syntactic hacks[1] such as “foo == bar ? a
= 10 : b = 20”, though the more readable version would clearly be to use
“if/else”.

Since the parser/compressor runs fine on DL and jQuery, I'm quite confident
that it's solid enough for production use. If you can identify any bugs,
I'd love to hear about them ([[http://groups.google.com/group/uglifyjs][use the Google Group]] or email me directly).

[1] I even reported a few bugs and suggested some fixes in the original
[[http://marijn.haverbeke.nl/parse-js/][parse-js]] library, and Marijn pushed fixes literally in minutes.

** Links

  • Twitter: [[http://twitter.com/UglifyJS][@UglifyJS]]
  • Project at GitHub: [[http://github.com/mishoo/UglifyJS][http://github.com/mishoo/UglifyJS]]
  • Google Group: [[http://groups.google.com/group/uglifyjs][http://groups.google.com/group/uglifyjs]]
  • Common Lisp JS parser: [[http://marijn.haverbeke.nl/parse-js/][http://marijn.haverbeke.nl/parse-js/]]
  • JS-to-Lisp compiler: [[http://github.com/marijnh/js][http://github.com/marijnh/js]]
  • Common Lisp JS uglifier: [[http://github.com/mishoo/cl-uglify-js][http://github.com/mishoo/cl-uglify-js]]

** License

UglifyJS is released under the BSD license:

#+BEGIN_EXAMPLE
Copyright 2010 (c) Mihai Bazon mihai.bazon@gmail.com
Based on parse-js (http://marijn.haverbeke.nl/parse-js/).

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

* Redistributions of source code must retain the above
  copyright notice, this list of conditions and the following
  disclaimer.

* Redistributions in binary form must reproduce the above
  copyright notice, this list of conditions and the following
  disclaimer in the documentation and/or other materials
  provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER “AS IS” AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.
#+END_EXAMPLE
