java - What is a suitable lexer generator that I can use to strip identifiers from many language source files?


I am working on a group project for my university which is going to be used for plagiarism detection in computer science.

My group is primarily going to be using the hashing / fingerprinting technique described in a journal article. This is similar to how an existing plagiarism detection system works.

We are basically taking hashes of k-grams from the source code of fellow students and looking them up in a database for matches (along with many optimizations in how we determine which hashes to choose as fingerprints for the document).
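To make the k-gram idea concrete, here is a minimal sketch in Java. It only shows the basic step of hashing every window of k characters and counting the hashes two documents share. The class and method names are made up for illustration, `String.hashCode()` is just a stand-in for whatever hash function is really used, and the fingerprint selection optimizations mentioned above are not shown.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of any library.
final class KGramHasher {

    // Returns one hash per k-gram, i.e. per sliding window of k characters.
    static List<Integer> kGramHashes(String text, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        List<Integer> hashes = new ArrayList<>();
        for (int i = 0; i + k <= text.length(); i++) {
            // String.hashCode() is a placeholder hash, chosen only for brevity.
            hashes.add(text.substring(i, i + k).hashCode());
        }
        return hashes;
    }

    // Counts the distinct k-gram hashes that occur in both texts.
    static int sharedHashes(String a, String b, int k) {
        Set<Integer> common = new HashSet<>(kGramHashes(a, k));
        common.retainAll(new HashSet<>(kGramHashes(b, k)));
        return common.size();
    }

    public static void main(String[] args) {
        String first = "int V = V + 1;";
        String second = "int V = V + 2;";
        System.out.println(sharedHashes(first, second, 5));
    }
}
```

In a real system the hashes would go into the database rather than being compared pairwise, but the windowing step is the same.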

The first aspect of our project is the "front-end" section, which will have some semantic knowledge about each file format that our detection system processes. This will allow us to strip out details of the document that we do not want to keep for the purpose of detecting plagiarism. In particular, we want to be able to rename all the variables in various programming languages to a constant string or letter.

What is a lightweight solution (a lexer generator or something similar) that we can use to help rename all the variables in source code files in various languages?

Our project is being written in Java.

Ideally I just want to be able to define a grammar for each language and then have our front end rename all the identifiers in those source files to some constant. We would do this for each file format that we would like to support (Java, C++, Python, etc.).
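To illustrate the kind of output we are after, here is a rough sketch in plain Java using `java.util.regex`, with no lexer generator involved. It replaces every identifier that is not a keyword with the constant `V`, and leaves string literals and numeric literals alone. The class name and the tiny keyword list are invented for the example; comments, character literals and language-specific rules are not handled, which is exactly the part a proper per-language grammar would take care of.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class IdentifierNormalizer {

    // Deliberately tiny, made-up keyword list; a real front end would
    // take the keywords from the grammar of each supported language.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "int", "return", "if", "else", "for", "while", "class", "void"));

    // Alternatives, in order: double-quoted string literal, numeric literal,
    // identifier. Strings and numbers are matched only so that they are skipped.
    private static final Pattern TOKEN = Pattern.compile(
            "\"(?:\\\\.|[^\"\\\\])*\"|[0-9][A-Za-z0-9_.]*|[A-Za-z_][A-Za-z0-9_]*");

    // Returns the source with every non-keyword identifier replaced by "V".
    static String normalize(String source) {
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(source);
        int last = 0;
        while (m.find()) {
            // Copy the text between tokens (operators, whitespace) unchanged.
            out.append(source, last, m.start());
            String token = m.group();
            char first = token.charAt(0);
            boolean identifier = (Character.isLetter(first) || first == '_')
                    && !KEYWORDS.contains(token);
            out.append(identifier ? "V" : token);
            last = m.end();
        }
        out.append(source, last, source.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "int V = V + 1;"
        System.out.println(normalize("int total = count + 1;"));
    }
}
```

With a generated lexer the regular expression and keyword set would be replaced by the token types of the grammar, but the renaming loop would look much the same.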

For a lexer / parser generator, you should look at ANTLR. TXL, which is a source transformation language, is also worth a look. Ready-made grammars should be available for both.

