Thicket Family of Source Code Obfuscators

Semantic Designs offers robust technology and a wider variety of language obfuscators than any other company, making us the premier supplier of source code obfuscator tools.

Concept

A source code obfuscator accepts a program source file, and generates another functionally equivalent source file which is much harder to understand or reverse-engineer. This is useful for technical protection of intellectual property when:

  • source code must be delivered for public execution purposes (with interpreted languages like ECMAScript in web pages)
  • commercial components must be delivered in source form for direct integration by a customer into an end product (portable applications in C or PHP etc., code libraries or hardware components coded in Verilog or VHDL)
  • test cases derived from proprietary code must be sent to vendors (code triggering buggy behavior, etc.)
  • object code still contains too many clues (such as class public methods used only inside an application, as with Java class files)

Semantic Designs' obfuscation tools generally strip comments, remove indentation and whitespace, encode constants in inconveniently readable ways, and rename identifiers in the source (variables, functions, etc.) from their original, presumably self-explanatory names to nonsense names that convey no information. A programmer-supplied file controls which names are preserved, to ensure continued access to published external APIs. This makes it straightforward for application development to continue while each new application version is still easily obfuscated.
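The renaming step with a preserved-name list can be sketched in a few lines. This is only an illustration under stated assumptions: `makeRenamer`, the naming scheme, and the preserved list are hypothetical and do not reflect the actual tools or their control-file format.

```javascript
// Hypothetical sketch of identifier renaming with a preserve list.
// Nothing here is Semantic Designs' actual implementation or file format.
function makeRenamer(preservedNames) {
  const preserved = new Set(preservedNames);
  const map = new Map(); // original name -> nonsense name
  let counter = 0;

  return function rename(name) {
    // Published API names stay readable so external callers keep working.
    if (preserved.has(name)) return name;
    if (!map.has(name)) {
      // Alternate the look-alike prefixes "O" and "l" so names are hard to scan.
      // A real tool must also avoid clashing with names already in the program.
      const prefix = counter % 2 === 0 ? "O" : "l";
      map.set(name, prefix + counter);
      counter += 1;
    }
    return map.get(name);
  };
}

const rename = makeRenamer(["document", "Math", "ieMinutes"]);
console.log(rename("HandX"));     // O0
console.log(rename("ieMinutes")); // ieMinutes (preserved)
console.log(rename("HandX"));     // O0 again: the same name always maps the same way
```

The important property is consistency: every occurrence of a name receives the same replacement, and preserved names pass through untouched.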

Our obfuscators typically turn small fragments of readable source code (JavaScript example):

for (i=0; i < M.length; i++){
   // Adjust position of clock hands
   var ML=(ns)?document.layers['nsMinutes'+i]:ieMinutes[i].style;
   ML.top=y[i]+HandY+(i*HandHeight)*Math.sin(min)+scrll;
   ML.left=x[i]+HandX+(i*HandWidth)*Math.cos(min);
 }


into this:

for(O79=0;O79<l6x.length;O79++){var O63=(l70)?document.layers[
"nsM\151\156u\164\145s"+O79]:ieMinutes[O79].style;O63.top=l61[O79]+O76+(O79*O75)
*Math.sin(O51)+l73;O63.left=l75[O79]+l77+(O79*l76)*Math.cos(O51);}

It still does exactly the same thing as the original, but it is far more difficult to guess what the variables do. A thief who doesn't know the intended purpose of the variables can hardly claim the code as his own, let alone modify it in competitive ways. Some names are purposely left unscrambled because access to public APIs must be preserved. For most languages (as in this case), indentation and line breaks are removed to make the code that much more inconvenient to analyze. The larger the code, the uglier the obfuscated result gets, and the harder it becomes to reverse engineer (see the sample code for each language-specific obfuscator on its product page, via the links below).
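The string constant in the obfuscated fragment is the original 'nsMinutes' with some characters written as octal escapes. A minimal sketch of that kind of constant encoding follows; the helper names `toOctalEscapes` and `fromOctalEscapes` are hypothetical, the sketch assumes ASCII text, and it escapes every character rather than the mixture shown in the sample.

```javascript
// Hypothetical sketch: write each character of an ASCII string as an octal
// escape, e.g. "i" (code 105) becomes the four characters \151.
function toOctalEscapes(text) {
  let out = "";
  for (const ch of text) {
    out += "\\" + ch.charCodeAt(0).toString(8);
  }
  return out;
}

// Decode octal escapes back to characters, leaving other text alone.
// Used here only to check that the encoded constant keeps its value.
function fromOctalEscapes(escaped) {
  return escaped.replace(/\\([0-7]{1,3})/g, (_, digits) =>
    String.fromCharCode(parseInt(digits, 8)));
}

console.log(toOctalEscapes("in"));                            // \151\156
console.log(fromOctalEscapes("nsM\\151\\156u\\164\\145s"));   // nsMinutes
```

Octal escapes are a legacy JavaScript feature and are rejected in strict mode code and in template literals, so an encoder aimed at modern code would more likely emit `\x` or `\u` escapes instead.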

Warning: obfuscators do not stop reverse-engineering efforts by really determined opponents. In fact, no protection scheme will. Rather, obfuscators are like good locks on bank vaults: they stop most thieves because the work is simply too much trouble, and they immediately frustrate the amateurs who try. Obfuscation can also help signal your clear legal intent to protect your code; a thief must first actively de-obfuscate it, and that act alone can be used as an indicator of his intentions versus those of honest users.

Technology

Many conventional obfuscation tools use ad hoc string processing methods to carry out the obfuscation process, or operate on binary files. String processing can sometimes work, but it often fails on multiple statements per line, nested comments, comments around incomplete blocks of code or keywords, obscure language features such as escapes in quoted strings, odd casing conventions for names, and similar constructs, all of which inevitably occur in large software systems. This is because programming languages have complex formation and name resolution rules, so processing them reliably usually requires a complete language parse, not string hacking. Failure to process the program correctly can produce an obfuscated version that is broken. Binary obfuscators can work quite well, but are generally limited to standard, simple binary formats (e.g., Java class files, not Win32 object files).
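A small experiment shows how string hacking goes wrong. The sketch below is a deliberately naive renamer (the function `naiveRename` and the sample line are made up for illustration) that replaces every whole-word occurrence of a name, with no idea whether it is looking at a variable, a property, or text inside a string.

```javascript
// Deliberately naive "string hacking": rename every whole-word occurrence.
// Assumes `from` contains no regex metacharacters (already a fragile assumption).
function naiveRename(source, from, to) {
  return source.replace(new RegExp("\\b" + from + "\\b", "g"), to);
}

const source = 'var top = 10; box.style.top = top + "px"; alert("top is " + top);';
const hacked = naiveRename(source, "top", "O17");

console.log(hacked);
// var O17 = 10; box.style.O17 = O17 + "px"; alert("O17 is " + O17);
```

The variable was renamed, but so were the DOM property `style.top` and the text of the message, so the result no longer behaves like the original. Telling these cases apart requires knowing the token and scope structure of the program.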

The most reliable way to build an obfuscator is to parse the source language according to the source language lexical and syntax rules into compiler-like data structures, carry out the obfuscation, and then unparse back to source text. This ensures that the syntax structures found match those of the language. All of SD's obfuscators are built as extensions of source code formatters, based on DMS's ability to parse and prettyprint source files, and are based on the language definition modules used to drive DMS for large scale software reengineering tasks.
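The shape of that pipeline can be illustrated with a much-simplified stand-in that works at the token level only. This is an assumption-laden sketch, not how DMS works: `TOKEN`, `KEYWORDS`, and `obfuscate` are hypothetical, there is no real parse or name resolution, and constructs such as regex literals, template literals, and object-literal keys are not handled. It does show why token classes matter: comments and strings are recognized as units, so their contents are never mistaken for identifiers.

```javascript
// Toy token-level pipeline: split into tokens, transform, re-emit text.
// A real obfuscator parses fully and resolves names; this sketch does not.
const KEYWORDS = new Set(["var", "for", "if", "else", "function", "return"]);

// Token classes in order: comment, string, identifier, number, whitespace, other.
const TOKEN = /(\/\/[^\n]*|\/\*[\s\S]*?\*\/)|("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')|([A-Za-z_$][\w$]*)|(\d[\w.]*)|(\s+)|([\s\S])/g;

function obfuscate(source, preservedNames) {
  const preserved = new Set(preservedNames);
  const names = new Map(); // original identifier -> scrambled identifier
  let previous = "";       // last token that was not whitespace or a comment
  let out = "";

  for (const m of source.matchAll(TOKEN)) {
    const [text, comment, , identifier, , space] = m;
    if (comment !== undefined || space !== undefined) {
      // Drop comments and indentation, but keep one separator so that tokens
      // never merge and line breaks (which can matter in JavaScript) survive.
      out += text.includes("\n") ? "\n" : " ";
      continue;
    }
    const renamable = identifier !== undefined && !KEYWORDS.has(text) &&
      !preserved.has(text) && previous !== "."; // leave property names alone
    if (renamable) {
      // Generated names could clash with existing ones; a real tool checks.
      if (!names.has(text)) names.set(text, "O" + names.size);
      out += names.get(text);
    } else {
      out += text; // strings, numbers, punctuation, keywords pass through
    }
    previous = text;
  }
  return { code: out, map: Object.fromEntries(names) };
}

const source = [
  "for (i = 0; i < M.length; i++) {",
  "  // Adjust position of clock hands",
  "  ML.top = y[i] + HandY; /* vertical */",
  "}",
].join("\n");

// Pretend ML is a published name that must be preserved.
const result = obfuscate(source, ["ML"]);
console.log(result.code);
console.log(result.map);
```

The sketch collapses whitespace conservatively instead of deleting it outright, because deciding which spaces and line breaks are safe to remove is itself a question for the language's grammar.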

The obfuscator is applied to a set of files at once, and obfuscates them all consistently. Each obfuscator run produces an obfuscation map showing how identifiers were scrambled, for reference or to ensure that later obfuscations scramble symbols in the same way. This can be used, for example, to ship obfuscated source updates to customers.
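Reusing a map across runs might look like the following sketch. The function `scramble` and the JSON representation of the map are assumptions made for illustration; the actual map format produced by the tools is not described here.

```javascript
// Hypothetical sketch: scramble identifiers, reusing names from an earlier run.
function scramble(identifiers, previousMap = {}) {
  const map = new Map(Object.entries(previousMap));
  const taken = new Set(map.values()); // scrambled names already in use
  let counter = 0;

  for (const name of identifiers) {
    if (map.has(name)) continue; // keep the name chosen in an earlier run
    let candidate;
    do {
      candidate = "O" + counter++;
    } while (taken.has(candidate));
    map.set(name, candidate);
    taken.add(candidate);
  }
  return Object.fromEntries(map);
}

// Version 1 of the product: obfuscate and save the map.
const v1 = scramble(["HandX", "HandY"]);
const saved = JSON.stringify(v1);

// Version 2 adds a new identifier; old ones keep their scrambled names.
const v2 = scramble(["scrll", "HandY", "HandX"], JSON.parse(saved));
console.log(v2); // { HandX: 'O0', HandY: 'O1', scrll: 'O2' }
```

Because existing identifiers keep their scrambled names, an obfuscated update stays compatible with obfuscated code the customer already has.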

All of SD's obfuscators are designed to operate as command-line style programs, to enable inclusion in scripts used for production purposes. There is a GUI available for initial configuration and operation. Consistent handling of formatting switches and I/O conventions across formatters and obfuscators aids software engineering staff when handling the multiple languages typically used by an organization.

SD's obfuscators are presently available on Windows 2003, XP, or later operating systems.

Available Obfuscators

SD offers a family of obfuscators based on DMS. Presently available are:

Download an evaluation version

What Customers Are Saying

"Your obfuscator is exactly what we were looking for! Even if a 'prettyfier' were used it would be terminally painful for me to reverse engineer code I wrote myself after running it through Semantic Design's tool. Semantic Designs clearly put lots of thought into it."
-- Eric Derbez

Custom Obfuscation Options

Semantic Designs can build custom obfuscators with special features:

  • FORTRAN or other standard programming languages
  • Unusual or custom languages or dialects
  • Encryption of literal strings (removes easy clues from binary dumps)
  • Program transformations to scramble logic while preserving function
  • Watermarks and hidden copyrights
For more information: info@semanticdesigns.com    Follow us at Twitter: @SemanticDesigns

