Finding code clones is far from trivial. For a good explanation of the challenges of finding clones in source code, I suggest reading "Comparison and evaluation of code clone detection techniques and tools: a qualitative approach". You can start with sections 2 and 6.
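To give a feel for why this is hard, here is a minimal sketch of my own (it is not taken from the paper or from either tool mentioned here). It assumes a toy tokenizer and a hypothetical `normalize` helper that replaces identifiers and numbers with placeholders, so that two snippets differing only in names and spacing compare equal, while a plain text comparison sees them as different.

```python
import re

# Hypothetical, deliberately tiny keyword list for a C-like language.
KEYWORDS = {"int", "return", "for", "if", "else", "while"}

# Identifiers, integer literals, or any single non-whitespace character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(source):
    """Return the token list with identifiers and numbers abstracted away."""
    tokens = []
    for tok in TOKEN.findall(source):
        if re.fullmatch(r"[A-Za-z_]\w*", tok) and tok not in KEYWORDS:
            tokens.append("ID")   # any non-keyword name
        elif tok.isdigit():
            tokens.append("NUM")  # any integer literal
        else:
            tokens.append(tok)    # keywords and punctuation are kept
    return tokens


a = "int sum(int a, int b) { return a + b; }"
b = "int add(int x,int y){return x+y;}"

print(a == b)                        # textual comparison
print(normalize(a) == normalize(b))  # comparison after normalization
```

Even this toy version only handles renamed identifiers and whitespace. Clones with inserted or reordered statements need more than a token-by-token comparison, which is where dedicated tools come in.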
There are at least two tools with source code available under permissive free software licenses: Deckard and CCFinderX. CCFinderX is an evolution of CCFinder, made by the same author.
Deckard is easy to build and test; CCFinderX was not. For Linux users, I made a fork from gpoo/ccfinderx and made some changes to simplify the build on Linux. The main change is to separate the core from the GUI. I now have two repositories:
Both are clones of gpoo/ccfinderx, but I've split things up. At the moment the GUI does not work, but you can build CCFinderX-core without weird OpenJDK dependencies, and it works, producing textual output. My goal with these repositories is to provide source code that can be compiled, packaged, and distributed.