NYC4SEC Meetup: Advanced Photo Forensics

NYC4SEC held a great Meetup on Wednesday to discuss image analysis and photo forensics… “Thanksgiving Meet-up: Let’s carve some data!” Professor Nasir Memon from NYU Poly came to give us some expert insight into the latest techniques in photo forensics. Before I give an overview of his talk, let me just say that NYC4SEC has been the best thing to happen to the NYC forensics community in the 4.5 years I’ve been here in the city. I’ve met a lot of great people working in our field through this Meetup group, some that I knew via email and message boards beforehand and some new. It’s also nice to see the students from John Jay’s computer forensics program coming to the Meetup to learn and meet industry experts.

Dr. Memon is a brilliant guy; in the past he served on the JPEG standard design committee, which, as you can imagine, makes for some very relevant experience in photo forensics. He gave us an overview of how SmartCarving works in Adroit Photo Forensics, a tool he helped design (which is awesome!). This wasn’t a sales pitch, though. He explained how the photo fragments are located and reassembled.

That was interesting, but not as interesting as the work he’s done matching digital photos to their source. Memon went through the various artifacts that digital photo capture devices leave behind. Every digital camera has physical imperfections in its hardware that leave a trail. Ballistics experts match a bullet to a gun by firing new bullets and comparing them to those left at crime scenes. In much the same way, photo experts can link a digital photo to a camera by taking new photos with that camera and comparing the noise patterns. There are two ways this technique could be useful: (1) creating a catalog of patterns from different types of cameras – the problem here is that different manufacturers sometimes use the same parts in their cameras; and (2) seizing a digital camera at a crime scene and being able to prove, definitively, that it took the pictures in question. It’s more than just a match to a make and model – it’s like DNA for cameras!

The final item Dr. Memon discussed was the ability to detect image manipulation in an automated fashion. This plays off the same basic theory of matching a camera to a photo: he can detect whether a portion of a photo has noise patterns different from the rest and, from that, discern that pieces of the photo are not original, but doctored or added.

The NYC4SEC Meetups just keep getting better! We had a great turnout and I hope to see more industry professionals at the next one!

eDiscovery Review and Predictive Coding with Statistics

Anne Kershaw and Joe Howie recently wrote a great summary for LTN on predictive coding in eDiscovery. The article gives a brief history and the evolution of review and coding practices in discovery, then gets to the good stuff. They present some pretty compelling numbers from recent studies that show just how inconsistent review efforts can really be. This isn’t a technical deep dive into predictive coding or statistics, but I hope it helps get the word out on what the Sedona Conference has been saying for a few years now.

I find their list of high points most interesting: Transparency, Replicability, Reevaluating production sets, Confidentiality, Shortened time lines. While you could say transparency is aided by the simple fact that predictive coding systems record more data from their users on how and why a document was coded one way or the other, the black box nature of the algorithms used to determine document links is still an issue for me. This probably won’t be changing any time soon unless consumers (attorneys and judges) demand it. Right now the amount of noise present in eDiscovery is so high that it is, perhaps, acceptable to give this a pass for the moment.

My absolute favorite quote from this article? Well, it has to do with why more attorneys aren’t using predictive coding:

Given the claimed advantages for predictive coding, why isn’t everyone using it? The most mentioned reason, cited by 10 respondents, was uncertainty or fear about whether judges will accept predictive coding. (Paradoxically, at a recent U.S. Magistrates’ Conference, a participant jurist asked for advice on how to convince lawyers to use this type of approach.)

This is in line with what I see in the industry every day. Despite eDiscovery education initiatives popping up in every legal conference, many attorneys still don’t seem to get it. Having sat through quite a bit of painful legal education in my time, I’ve seen a recurring issue with how new ideas are presented in the legal setting. Quoting David Alan Grier as Science Dude on tonight’s episode of Bones: “And what do we say about clarity? It’s barbarity that clarity is a rarity.” Just because you’re a good attorney doesn’t mean you’re a good educator. This is true of so many professions…

Sampling, one of my favorite topics, is given a mention at the end of the article. I think many litigation shops fall back into old habits too easily, and forget how much time and money proper sampling can save them. Not just in processing and review, but also in time [not] wasted in the courtroom. Jason R. Baron and Ralph Losey literally talk about it all the time.

If you’re interested in learning more about statistical sampling for eDiscovery and how you can save us all a big headache (and maybe some money, too), head on over to my Presentations page and grab a copy of Statistical Validation And Data Analytics In eDiscovery, a talk I gave at IQPC eDiscovery West 2010 in San Francisco earlier this year. Let me know how you use sampling or predictive coding and how we can better educate the community in the comments!

Creating a COM Accessible DLL in Visual C# for EnScript

A while back, I came across a thread on the EnCase EnScript forum asking for assistance in getting EnScript to recognize the interfaces for a DLL created in C#.Net through COM. I created a self-contained C# demo project to help out. The original code can be browsed on github under branch “v0.1”. This post updates the information in that thread with extra references and notes on 64-bit OS usage. The latest code, as of this posting, can be found on github under branch “v0.2” or on the EnScripts page of the website. If you’re still interested, please download the code and follow along. If you’re a glutton for punishment, feel free to read all the links in their full MSDN glory.

In the scenario, the same DLL worked just fine when accessing its methods and properties through VBScript, but EnScript wouldn’t “see” any of them. This wasn’t really an EnScript problem or, necessarily, a .NET problem. It’s an issue with the early binding that EnCase requires, combined with the registration steps that .NET doesn’t perform on its own.

Making the Code Work:
There are, however, a few things to take care of in the code before worrying about registration. Your interface should be set to InterfaceIsDual, your class interface type can be set to None, and you should set ComVisible to true on both of them. In the code below, you would replace InterfaceGUID and ClassGUID with your own previously generated GUID values. If you’ve worked with Visual Studio for any length of time, you’ll soon discover that you should generate your own GUIDs ahead of time using guidgen.exe and manually assign them, otherwise Visual Studio will create new GUIDs every time you run Regasm.exe (which is annoying). The below code can be found in the demo in CDemo.cs.


[Guid("InterfaceGUID"), InterfaceType(ComInterfaceType.InterfaceIsDual)]
[ComVisible(true)]
public interface _COMDemo
{
...
}
[Guid("ClassGUID"), ClassInterface(ClassInterfaceType.None)]
[ComVisible(true)]
public class COMDemo : _COMDemo
{
...
}

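For reference, the elided members can be inferred from the EnScript demo later in this post; the signatures below are a hypothetical sketch, not the real code (the actual definitions live in CDemo.cs in the github project):

```csharp
// Hypothetical member signatures inferred from the EnScript demo below;
// see CDemo.cs in the github project for the real definitions.
int Value1();              // getter; the demo prints 0 before any value is set
void SetValue1(int value); // setter; the demo sets 1000
int PlusFive(int value);   // PlusFive(7) returns 12 in the demo
```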
Before we get any further, I should also mention in passing that it’s always best to use Strong-Named Assemblies.
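If you haven’t strong-named an assembly before, the short version is to generate a key pair with sn.exe and point the project at it via the Signing tab in Visual Studio (the key file name below is hypothetical). Gacutil.exe won’t install an assembly into the GAC unless it’s strong-named, so this step matters for what follows.

```
rem Generate a key pair to sign the assembly with (hypothetical file name)
sn.exe -k CDemoKey.snk

rem After building with the key, verify the strong name
sn.exe -v CDemoLib.dll
```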

After your code is all set, we need to deal with the registration issues. The first thing we need to do is register our DLL in the Global Assembly Cache using Gacutil.exe or a capable installer:

gacutil.exe /i CDemoLib.dll /f

Next, we’ll register the assembly for COM access using Regasm.exe:

regasm.exe CDemoLib.dll /tlb:CDemoLib.tlb

Creating the TypeLib using Regasm.exe generates a default interface with no methods. VBScript makes only late-bound calls into the assembly; it doesn’t use the TypeLib, so it doesn’t care that the default interface is empty. Early binders like EnScript, however, are left looking at a class that appears to have no methods or properties at all.

Regasm.exe also fails to add all of the registry keys that are required for early binders, and these can be hard to track down. Take a look at the missing keys below.

Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\CLSID\{ClassGUID}\Control]
[HKEY_CLASSES_ROOT\CLSID\{ClassGUID}\MiscStatus]
@="131457"
[HKEY_CLASSES_ROOT\CLSID\{ClassGUID}\Typelib]
@="{TypelibGUID}"
[HKEY_CLASSES_ROOT\CLSID\{ClassGUID}\Version]
@="1.0"

HKEY_CLASSES_ROOT is an alias for HKEY_LOCAL_MACHINE\SOFTWARE\Classes. These keys are essential for the methods and properties to be COM accessible to early binders. The Control key identifies our object as an ActiveX control. The MiscStatus key takes its values from the OLEMISC enumeration. The Typelib key points to the GUID of the TypeLib, which is the GuidAttribute of the assembly (found in AssemblyInfo.cs). And Version is pretty straightforward. This article mostly applies, despite its age. With our custom entries added, the registration is finally complete.

After you’ve thrown in all the extras, you should be ready to go with EnScript! Fire up EnCase and take a look at the sample EnScript that’s included with the code. The first thing you’ll notice at the top is the typelib instruction.

typelib aCDemoClass "CDemoLib.COMDemo"

This tells EnCase to retrieve the CLSID for the assembly from HKEY_CLASSES_ROOT\CDemoLib.COMDemo\CLSID, then locate the CLSID at HKEY_CLASSES_ROOT\CLSID\{ClassGUID} and import the TypeLib specified. If we got everything right, the class and all of its members will show up in the EnScript Types tab of EnCase.

Magic!
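Under the hood, that ProgID-to-CLSID hop is just a couple of registry lookups. Roughly, with the same GUID placeholders as before, Regasm.exe creates:

```
[HKEY_CLASSES_ROOT\CDemoLib.COMDemo\CLSID]
@="{ClassGUID}"

[HKEY_CLASSES_ROOT\CLSID\{ClassGUID}\InprocServer32]
@="mscoree.dll"
```

The InprocServer32 value pointing at mscoree.dll is how COM knows to load the CLR to host our managed class.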

With the properties and methods showing up just fine, next we’re going to declare our variable using the newly imported class and call Create() to instantiate the object.

aCDemoClass::COMDemo acd;
...
acd.Create();

That’s it! You can see from the demo code that we can set and retrieve the values of our properties:

Console.WriteLine("Value1: " + acd.Value1());
...
acd.SetValue1(1000);
...
Console.WriteLine("Value1: " + acd.Value1());

Output:
Value1: 0
Value1: 1000

And we can also utilize the methods from our DLL:

Console.WriteLine("acd.PlusFive(7): " + acd.PlusFive(7));

Output: “acd.PlusFive(7): 12”

64-bit
Recently an acquaintance was having issues with getting the project to work in a 64-bit OS, so I updated the example registration files for version 0.2. The only real difference is that you run the 64-bit version of Regasm.exe from the Framework64 directory. This inserts the necessary info for use with the 64-bit version of EnCase. The same registry keys we inserted above still apply because Windows aliases the proper section of the registry for 64-bit usage. The latest version of the code updates the project to Visual Studio 2010 and I’ve confirmed testing for .NET Framework versions 2 and 4 with the new project.
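For example, for .NET 2.0 (the version directory will differ for other framework versions installed on your machine):

```
%WINDIR%\Microsoft.NET\Framework64\v2.0.50727\regasm.exe CDemoLib.dll /tlb:CDemoLib.tlb
```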

Got Errors?
If you forgot to perform any registration at all, you’ll probably see, “Expecting ‘field or method declaration’, Custom dotNet COM Object(13,13).”

Depending on the version of Regasm.exe used, if you don’t add the custom registry keys, when you run the script you might see a “Class not registered” error or something similar.

Anything I missed? Drop me a line in the comments.

NYC4SEC Meetup with Ovie Carroll

I just got back from the NYC4SEC Meetup held at Pace University. It was a welcome opportunity to see old colleagues and friends and meet many people in the industry with whom I’ve conversed electronically. The turnout was great – probably about 30 people – and it was definitely a success! It’s great to be able to swap war stories and share experiences with peers in the forensics world.

The speaker tonight was Ovie Carroll, Director for the Cybercrime Lab at the Department of Justice, Computer Crime and Intellectual Property Section (CCIPS) and an adjunct professor at George Washington University. Ovie was in town to teach the SANS Forensics 408 course and agreed to stop by and speak to our group. Ovie is a fantastically engaging speaker with a quick wit and more jokes than Steve Martin. I think it’s fair to say the crowd, which represented diverse experience levels, thoroughly enjoyed his presentation. His style of presentation really brings the audience into the conversation.

Ovie touched on a lot of things from basic knowledge on where to look for physical pieces of evidence to very specific artifacts. He showed some intelligent timeline presentation graphics and underscored the value of trace pieces of evidence by discussing a prominent case he dealt with and how they tracked the suspect down from a seemingly innocuous screenshot.

The overarching theme of his presentation was that forensic analysts have to be smarter about how they approach different problems and also how they interface with counsel or those requesting the analysis. He discussed the need to develop an investigative plan, just as one would do with non-digital evidence and investigations. This plan should involve counsel as early as possible, and you should push back on those who say, “Just give me everything!” (My addition: As experts, this is exactly what we get paid to do – not just to nod and go digging for everything, but to tell the client and counsel what our expert experience says they should be looking for and what its value is!!!)

Analysts often have a knee-jerk first reaction of “I can’t do anything with this drive yet, it’s not imaged!” – and Ovie would say that’s dead wrong. He pushed the idea of triage quite a bit, pointing out that you might not get all the benefits of a full forensic exam, but you might get just enough to have a conversation with the subject of your investigation and bring more value in the process. Triage is not going to put an end to forensic exams; it’s just another tool that gets you a few pieces quickly. The key to making triage useful is that it must produce output that’s easy for non-technical people to understand, otherwise the usefulness of fast data retrieval declines rapidly.

The other pain point Ovie listed is one I’m sure everyone feels: an overwhelming amount of evidence to deal with. For law enforcement, more and more digital evidence is being seized as officers and prosecutors become more familiar with the value of digital evidence; for corporate investigators, enterprises are adding more varied devices to their technology lineups that analysts need to keep on top of. This is an issue everyone has been dealing with for a while now and Ovie, like all of the rest of us, is searching for the light at the end of the tunnel. Of course, no good digital forensics presentation would be complete without the Find All Evidence button!

Thanks to Ovie for his great presentation, to Doug Brush for organizing, to J-Michael Roberts and his wife for the delicious USB dongle-shaped cake (!!!), and to Pace University for hosting us! Joe Garcia is trying to line up some great folks from the industry for upcoming meetups. I’m sure everyone is looking forward to our next great get together, so keep an eye on the schedule and please join the Meetup group to stay informed!