
Revision as of 10:13, 5 February 2023

Source: https://trailofbits.github.io/ctf/

CTF Field Guide

“Knowing is not enough; we must apply. Willing is not enough; we must do.” - Johann Wolfgang von Goethe

Welcome!

We’re glad you’re here. We need more people like you.

If you’re going to make a living in defense, you have to think like the offense.

So, learn to win at Capture The Flag (CTF). These competitions distill major disciplines of professional computer security work into short, objectively measurable exercises. The focus areas that CTF competitions tend to measure are vulnerability discovery, exploit creation, toolkit creation, and operational tradecraft.

Whether you want to succeed at CTF, or as a computer security professional, you’ll need to become an expert in at least one of these disciplines. Ideally in all of them.

That’s why we wrote this book.

In these chapters, you’ll find everything you need to win your next CTF competition:

Walkthroughs and details on past CTF challenges
Guidance to help you design and create your own toolkits
Case studies of attacker behavior, both in the real world and in past CTF competitions

To make your lives easier, we’ve supplemented each lesson with the Internet’s best supporting reference materials. These come from some of the best minds in the computer security field. Looking ahead, we hope you’ll collaborate to keep this book evolving with the industry.

We’ve tried to structure this so you can learn as quickly as you want, but if you have questions along the way, contact us. We’ll direct your question to the most relevant expert. If there’s enough demand, we may even schedule an online lecture.

Now, to work.

-The Trail of Bits Team

Capture the Flag

Why CTF?

Computer security represents a challenge to education due to its interdisciplinary nature. Topics in computer security are drawn from areas ranging from theoretical aspects of computer science to applied aspects of information technology management. This makes it difficult to encapsulate the spirit of what constitutes a computer security professional.

One approximation for this measure has emerged: the ‘capture the flag’ competition. Attack-oriented CTF competitions try to distill the essence of many aspects of professional computer security work into a single short exercise that is objectively measurable. The focus areas that CTF competitions tend to measure are vulnerability discovery, exploit creation, toolkit creation, and operational tradecraft.

A modern computer security professional should be an expert in at least one of these areas and ideally in all of them, and success in CTF competitions demands the same of participants. Therefore, preparing for and competing in CTF represents a way to efficiently merge discrete disciplines in computer science into a focus on computer security.

Find a CTF

If you ever wanted to start running, you were probably encouraged to sign up for a 5k to keep focused on a goal. The same principle applies here: pick a CTF in the near future that you want to compete in and come up with a practice schedule. Here are some CTFs that we can recommend:

PicoCTF and PlaidCTF by CMU
HSCTF is made for high school students
Ghost in the Shellcode (GitS)
CSAW CTF by NYU-Poly
UCSB iCTF is for academics only
Defcon CTF

Visit CTF Time and the CapCTF calendar for a more complete list of CTFs occurring every week of the year.

How is a Wargame different?

Wargames are similar to a CTF but are always ongoing. Typically, they are organized into levels that get progressively harder as you solve more of them. Wargames are an excellent way to practice for CTF! Here are some of our favorites:

Micro Corruption
SmashTheStack
OverTheWire
Exploit Exercises

What about CCDC?

There are some defense-only competitions that disguise themselves as CTF competitions, mainly the Collegiate Cyber Defense Challenge (CCDC) and its regional variations, and our opinion is that you should avoid them. They are unrealistic exercises in frustration and will teach you little about security or anything else. They are incredibly fun to play as a Red Team though!

Find a Job

Career Cheatsheet

[Editor's note: this is an older article, written for pentest.cryptocity.net, that we are in the process of updating.]

These are my views on information security careers based on the experience I've had and your mileage may vary. The information below will be most appropriate if you live in New York City, you're interested in application security, pentesting, or reversing, and you are early on in your career in information security.

Employers
Roles
Learn from a Book
Learn from a Course
University
Communication
Meet People
Conferences
Certifications
Links
Friends of the Guide

Employers

As far as I can tell, there are five major employers in the infosec industry (not counting academia).

The Government
Non-Tech Fortune 500s (mostly finance)
Big Tech Vendors (mostly West coast)
Big Consulting (mostly non-technical)
Small Consulting (mostly awesome)

The industry you work in will determine the major problems you have to solve. For example, the emphasis in finance is to reduce risk at the lowest cost to the business (opportunities for large-scale automation). On the other hand, consulting often means selling people on the idea that X is actually a vulnerability and researching to find new ones.

Roles

I primarily split up infosec jobs into internal network security, product security, and consulting. I further break down these classes of jobs into the following roles:

Application Security (code audits/app assessments)
Attacker (offensive)
Compliance
Forensics
Incident Handler
Manager
Network Security Engineer
Penetration Tester
Policy
Researcher
Reverse Engineer
Security Architect

The roles above each require a different, highly specialized body of knowledge. This website is a great resource for application security and penetration testing, but you should find other resources if you are interested in a different role.

Learn from a Book

Fortunately, there are dozens of good books written about each topic inside information security. Dino Dai Zovi and Tom Ptacek both have excellent reading lists. We recommend looking at:

Gray Hat Hacking
The Myths of Security
Hacking: The Next Generation
and any book from O'Reilly on a scripting language of your choice

If you're not sure what you're looking for, then you should browse the selection offered by O'Reilly. They are probably the most consistent and high-quality book publisher in this industry.

Don't forget that reading the book alone won't give you any additional skills beyond the conversational. You need to practice or create something based on what you read to really gain value and understanding from it.

Learn from a Course

If you're looking for something more hands-on and directed, there are lots of university courses about information security available online. I listed some of the best ones that have course materials available below (ordered by institution name). The RPI course is the most similar to this one and Hovav gets points for the best academic reading list, but every course on this list is fantastic.

Course | Instructor(s) | Institution
Secure Software Principles | RPISEC | RPI
Modern Binary Exploitation | RPISEC | RPI
Computer Security | various | Berkeley
Computer and Network Security | Dan Boneh | Stanford
Web Programming and Security | Dan Boneh | Stanford
Intro to Web Application Security | Edward Z. Yang | MIT
Intro to Software Exploitation | Nathan Rittenhouse | MIT
UNIX Security Holes | D. J. Bernstein | UIC
Malware Analysis and Antivirus Technologies | various | TML
System Security and Binary Code Analysis | Zhiqiang Lin | UT Dallas
Cybersecurity Specialization | various | UMD
Graduate Computer Security | Hovav Shacham | UCSD

University

The easiest shortcut to finding a university with a dedicated security program is to look through the NSA Centers of Academic Excellence (NSA-COE) institution list. This certification has become watered down as more universities have obtained it and it might help to focus your search on those that have obtained the newer COE-CO certification. Remember, certifications are only a guideline. You should look into the actual programs at each university instead of basing your decision on a certification alone.

Once in university, take classes that force you to write code in large volumes to solve hard problems. IMHO the courses that focus on mainly theoretical or simulated problems provide limited value. Ask upper level students for recommendations if you can't identify the CS courses with programming from the CS courses done entirely on paper. The other way to frame this is to go to school for software development rather than computer science.

Capture the Flag

If you want to acquire and maintain technical skills and you want to do it fast, then you should play in a CTF or jump into a wargame. The one thing to note is that many of these challenges attach themselves to conferences (of all sizes), and by playing in them you will likely miss the entire rest of the conference. Try not to overdo it, since conferences are useful in their own way (see the rest of the career guide).

There are some defense-only competitions that disguise themselves as normal CTF competitions, mainly the Collegiate Cyber Defense Challenge (CCDC) and its regional variations, and my opinion is that you should avoid them. They are exercises in system administration and frustration and will teach you little about security or anything else. They are incredibly fun to play as a Red Team though.

Communication

In any role, the majority of your time will be spent communicating with others, primarily through email and meetings and less by phone and IM. The role/employer you have will determine whether you speak more with internal infosec teams, non-security technologists, or business users. For example, expect to communicate more with external technologists if you do network security for a financial firm.

Tips for communicating well in a large organization:

Learn to write clear, concise, and professional email.
Learn to get things done and stay organized. Do not drop the ball.
Learn the business that your company or client is in. If you can speak in terms of the business, your arguments a) to not do things b) to fix things and c) to do things that involve time and money will be much more persuasive.
Learn how your company or client works, i.e. key individuals, processes, or other motivators that factor into what gets things done.

If you are still attending a university, as with CS courses, take humanities courses that force you to write.

Meet People

Find and go to your local CitySec, an informal meetup without presentations that occurs once monthly in most cities. At Trail of Bits, we attend our local NYSEC.

ISSA and ISC2 focus on policy, compliance and other issues that will be of uncertain use for a new student in this field. Similarly, InfraGard mainly focuses on non-technical law enforcement-related issues. OWASP is one of the industry's worst examples of vendor capture and is less about technology and more about sales.

Conferences

If you've never been to an infosec conference before, use the google calendar below to find a low-cost local one and go. There have been students of mine who think that attending a conference will be some kind of test and put off going to one for as long as possible. I promise I won't pop out of the bushes with a final exam and publish your scores afterward.

Information Security Conferences Calendar

If you go to a conference, don't obsess over attending a talk during every time slot. The talks are just bait to lure all the smart hackers to one location for a weekend: you should meet the other attendees! If a particular talk was interesting and useful then you can and should talk to the speaker. This post by Shawn Moyer at the Defcon Speaker's Corner has more on this subject.

If you're working somewhere and are having trouble justifying conference attendance to your company, the Infosec Leaders blog has some helpful advice.

Certifications

This industry requires specialized knowledge and skills and studying for a certification exam will not help you gain them. In fact, in many cases, it can be harmful because the time you spend studying for a test will distract you from doing anything else in this guide.

That said, there are inexpensive and vendor-neutral certifications that you can reasonably obtain with your current level of experience to help set apart your resume, like the Network+ and Security+ or even a NOP, but I would worry about certifications the least in your job search or professional development.

In general, the two best reasons to get certifications are:

If you are being paid to get certified, through paid training and exams or sometimes through an automatic pay raise after you get the certification (common in the government).
If your company or your client is forcing you to get certified. This is usually to help with a sales pitch, i.e. "You should hire us because all of our staff are XYZ certified!"

In general, it is far more productive to spend time playing in a CTF and then use your final standing as proof that you're capable.

Links

Reddit and Hacker News threads about this post

Security Advice

How to Break Into Security, Ptacek Edition
VRT: How to Become an Infosec Expert, Part I
Five pieces of advice for those new to the infosec industry
How to Milk a Computer Science Education for Offensive Security Skills
Kill Your Idols, Shawn Moyer's reflections on his first years at Defcon

Thoughts on Certifications

My Canons of (ISC)2 Ethics
Not a CISSP
(ISC)2's Newest Cash Cow
Why You Should Not Get a CISSP

General Tech Advice

Advice for Computer Science College Students
Don't call yourself a programmer, and other career advice
The answer to "Will you mentor me?" is .... no.

Vulnerability Discovery

Auditing Source Code

This module is about getting familiar with vulnerabilities that manifest in applications that compile to native code. An accurate and complete understanding of an application written in a compiled language cannot be achieved without learning about how the compiler transforms source to machine code and how processors execute that code. An easy way to gain experience with these transforms is by reverse engineering compiled variants of your own code or of projects with source code available. At the end of this module you will be able to identify vulnerabilities in compiled languages like C and C++.

Vulnerabilities are commonly identified in large software packages due to their use of third-party software libraries. Common examples include libraries like libxml, libpng, libpoppler, and libfreetype that parse complicated file formats and protocols. Each of these libraries has historically been prone to software flaws that make the applications using them vulnerable to attack. It doesn't help that most software packages fail to update these libraries when new versions come out, making it significantly easier to find known vulnerabilities to apply in these cases.

Lecture

Source Code Auditing I
Source Code Auditing II

Workshop

In order to practice your skills, we recommend going through the process of identifying as many vulnerabilities as you can in an intentionally vulnerable toy application and then moving on to a real application and doing the same.

The Newspaper application is a small server written in C that allows authenticated users to read and write articles to a remote file system. Newspaper is written in such a way that it is vulnerable to many different attacks. You should be capable of identifying at least 10 bugs or potential vulnerabilities in this code.

Newspaper App
Newspaper App Installer

Wireshark, however, is an industry standard network protocol analyzer that has been under continuous development since 1998. Vulnerabilities in this code base are far fewer than in the Newspaper app, but many still exist. Take a look at the Wireshark security page, find the name of a protocol dissector, and see if you can independently discover the vulnerability without looking at the details. Dissectors are located in the /epan/dissectors/ folder.

Wireshark 1.8.5

Tools

When auditing, it is helpful to use a tool designed to profile and navigate the codebase. Below are two tools, Source Navigator and Understand, designed to help analysts get familiar with code quickly by collecting and displaying information about data relationships, API usage, design patterns and control flow. An example of a useful diffing tool is also listed below. One example of a free, open source code auditing tool is the Clang Static Analyzer, which can help you track down programming errors in common APIs and vulnerable programming patterns.

Source Navigator
Scitools Understand
Meld
Clang Static Analyzer

Resources

Make sure you’re intimately familiar with the internals of the languages you target for analysis. Vulnerabilities are found by auditors who are more familiar with the language and the codebase than the programmers that originally developed it. In some cases, this level of understanding can be achieved simply by paying attention to optional compiler warnings or through the use of third-party analysis tools that help track down common programming errors. Computer security is tantamount to computer mastery. Without a rigorous understanding of your targets you can never hope to defeat them.

Essential C - Programming in C primer
TAOSSA Chapter 6: C Language Issues - Strongly recommended reading
Integer Overflow
Wireshark Security - Examples of lots of vulnerabilities
Gera's Insecure Programming by Example - Examples of small vulnerable C programs
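
As a small illustration of the integer overflow class listed in the resources above, the following Python sketch models how a 32-bit unsigned size calculation in C can wrap around. The function names and the 32-bit width are assumptions chosen for illustration; they do not come from the Newspaper app or Wireshark.

```python
# Hypothetical model of a C expression such as `count * elem_size`
# evaluated in 32-bit unsigned arithmetic (an assumption for this sketch).
UINT32_MASK = 0xFFFFFFFF


def alloc_size_u32(count: int, elem_size: int) -> int:
    """Return the size a 32-bit unsigned multiplication would produce."""
    return (count * elem_size) & UINT32_MASK


def is_wrapped(count: int, elem_size: int) -> bool:
    """True when the mathematically correct size does not fit in 32 bits."""
    return count * elem_size > UINT32_MASK


# A parser that trusts `count` from a file header may allocate far less
# memory than it later writes to.
count, elem_size = 0x40000001, 4
print(hex(alloc_size_u32(count, elem_size)), is_wrapped(count, elem_size))
```

When auditing, a multiplication or addition that feeds an allocation and takes an operand from attacker-controlled input is worth checking against this pattern.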

Auditing Binaries

You’ve made it all the way down to the native layer. This is what software is after you pull off all the covers. The flavor of native code we’re going to focus on today is 32-bit Intel x86. Intel processors have been a powerful force in personal computing since the 80’s and currently dominate the desktop and server markets. Understanding this instruction set will give you some insight into how the programs you use every day operate as well as provide a reference point for when you encounter other instruction sets like ARM, MIPS, PowerPC and SPARC.

This module is about becoming familiar with the native layer and developing strategies for understanding, analyzing and interpreting native code. By the end of this module you should be capable of performing a “reverse compilation” -- going from assembly fragments to statements in higher level languages -- and, in the process, deriving meaning and programmer intent.

Lecture

Learning x86 can appear daunting at first and requires some dedicated study to master. We recommend reading Chapter 3 of "Computer Systems: A Programmer's Perspective" to learn how C programs get compiled into machine code. Once you have some basic, working knowledge of this process, keep a handy reference guide around like the x86 Assembly Guide from the University of Virginia. We've found this video series from Quinn Liu to be a quick and painless introduction too.

CS:APP Chapter 3: Machine-Level Representation of Programs
x86 Assembly Guide
Introduction to x86 Assembly

Workshop

The following programs are both “binary bombs.” Reverse engineer the following linux programs to identify the series of inputs that will “defuse” the bomb. Each successive level of the bomb focuses on a different aspect of native code. For example, in the lab from CMU you will encounter different data structures (linked lists, trees) as well as how control flow structures (switches, loops) manifest in native code. While reversing these programs you may find it useful to keep track of your progress by transforming what you see into C or another high level language.

You should aim to solve at least eight stages between the two labs. The CMU bomb lab has a secret phase and the RPI bomb lab has a phase that involves memory corruption. Can you find and solve them?

CMU Binary Bomb Lab
RPI Binary Bomb Lab

Tools

The two essential tools for working with native code are the debugger and the disassembler. We recommend you become familiar with the industry standard disassembler: IDA Pro. IDA will divide code into discrete chunks corresponding to the functions defined in the program's source code. Each function is then further subdivided into "basic blocks" defined by instructions that alter control flow. This makes it easy to identify loops, conditionals, and other control flow constructs at a glance.

Debuggers allow you to interact with and examine the state of running code by setting breakpoints and examining memory and register contents. You may find this useful as a sanity check if you are not seeing the results you expect your input to produce. Be alert, though: some programs employ anti-debugger techniques and can modify program behavior in the presence of a debugger. The GNU Debugger (gdb) is the standard debugger for most Linux systems. gdb can be acquired through the package manager in your chosen Linux distribution.

IDA Pro Demo
gdb

Resources

Many good resources exist for learning x86 assembly and the various tricks employed in capture the flag exercises. In addition to the resources above, the x86 Wikibook and the AMD instruction set manual are more complete reference guides you may want to refer to (we find the AMD manual can be less daunting than the corresponding manual from Intel).

AMD64 Programmer's Manual: General-Purpose and System Instructions
x86 Assembly Wikibook
Computer Systems: A Programmer's Perspective (CS:APP)

Some of the tools used for reverse engineering can be as complicated as assembly language itself. Cheatsheets that list out common commands and use cases can be helpful.

gdb Quick Reference
IDA Quick Reference
WinDBG x86 Cheat Sheet

Finally, many capture the flag challenges will make use of anti-debugging and anti-disassembly techniques to hide or obfuscate the goal. Several of these techniques are employed by the bomb labs but you may want a more complete reference.

Linux anti-debugging techniques
The "Ultimate" Anti-Debugging Reference

Auditing Web Applications

Welcome to the web hacking module. This module is about getting familiar with vulnerabilities commonly found in web applications. At the end of this module you will be able to identify common vulnerabilities in web based applications using a variety of testing methodologies and source level auditing. The lecture material will give you all the tools you need to successfully audit the workshop material.

Lecture

Web Hacking Part I
Web Hacking Part II

Workshop

In order to practice your skills, we recommend going through the process of finding and exploiting vulnerabilities in the Damn Vulnerable Web App (DVWA) and the Siberia Exploit Kit. DVWA is a collection of vulnerable test cases implemented in PHP and serves as an easy introduction to the many things that can go wrong in web applications. The Siberia Exploit Kit is a "crimeware pack" used by criminals to perform massive compromises. It includes a package of browser exploits and a command and control panel intended to manage compromised hosts. Siberia contains several pre- and post-authentication vulnerabilities that allow an attacker to gain administrative access to the panel, then take over the server on which it is hosted.

Download and run the OWASP Broken Web Apps virtual machine in VMware to start this workshop. BWA includes many web applications made for security testing, including DVWA. Once you have mastered DVWA, feel free to move on to other vulnerable web applications! Try auditing Siberia's source code to find the vulnerabilities, paying attention to sources of input in PHP.
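
One way to start paying attention to sources of input in PHP is to list every place the code reads from a request superglobal. The sketch below is a minimal first pass under stated assumptions: the function name and the sample snippet are hypothetical, and the regular expression only covers the common superglobals ($_GET, $_POST, $_COOKIE, $_REQUEST, $_FILES, $_SERVER), not every way a PHP application can receive data.

```python
import re

# Common PHP superglobals that carry request-controlled data.
INPUT_SOURCE = re.compile(r"\$_(GET|POST|COOKIE|REQUEST|FILES|SERVER)\b")


def find_input_sources(php_source: str):
    """Return (line number, superglobal, stripped line) for each match."""
    hits = []
    for lineno, line in enumerate(php_source.splitlines(), start=1):
        for match in INPUT_SOURCE.finditer(line):
            hits.append((lineno, match.group(0), line.strip()))
    return hits


# Hypothetical snippet, not taken from DVWA or Siberia.
sample = "<?php\n$id = $_GET['id'];\necho 'bot ' . $id;\n"
for lineno, name, text in find_input_sources(sample):
    print(f"{lineno}: {name}: {text}")
```

Each hit is a starting point: follow the variable forward to see whether it reaches a query, a file path, or output without validation.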

OWASP Broken Web Apps
Siberia Crimeware Pack (pw: infected)

Tools

Burp Suite is a local HTTP proxy intended for security testing. Burp Suite is made for web penetration testers and simplifies many common tasks in a point-and-click GUI. The features available in the free version are more than enough to complete this and many other web security challenges.

Burp Suite

Resources

Many simple testing methods and common web application flaws are available in the walkthrough. Ensure that you understand the fundamentals of HTTP, HTML, and PHP to do well on this section.

OWASP Top 10 Tools and Tactics
The Tangled Web: Chapter 3
PHP Primer

Exploit Creation

Exploiting Binaries 1

Binary exploitation is the process of subverting a compiled application such that it violates some trust boundary in a way that is advantageous to you, the attacker. In this module we are going to focus on memory corruption. By abusing vulnerabilities that corrupt memory in software we can often rewrite critical application state information in a way that allows us to elevate privileges inside the context of a particular application (like a remote desktop server) or perform arbitrary computation by hijacking control flow and running code of our choosing.

If you're trying to find bugs in compiled C programs, it's important to know what you're looking for. Start with identifying where the data you send to the program is used. If your data is stored in a buffer, take note of the buffer's size. Programming in C without errors is very difficult and the CERT C Coding Standard catalogues many of the ways that errors can come about. Paying attention to commonly misused APIs can be a quick path to success.
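
One common way to learn how far your input travels past a buffer is to send a non-repeating pattern and look up whichever four bytes end up in a register after a crash. The sketch below is an assumption-laden illustration, not a tool from this guide: the function names are hypothetical, and the pattern is the familiar uppercase/lowercase/digit sequence in which every 4-byte window is unique.

```python
import string
import struct


def cyclic_pattern(length: int) -> bytes:
    """Build 'Aa0Aa1Aa2...' up to `length` bytes (at most 20280)."""
    groups = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                groups.append(upper + lower + digit)
                if 3 * len(groups) >= length:
                    return "".join(groups)[:length].encode()
    raise ValueError("pattern length too large")


def pattern_offset(value: int, length: int = 1024) -> int:
    """Offset of a little-endian 32-bit value in the pattern, or -1."""
    needle = struct.pack("<I", value)
    return cyclic_pattern(length).find(needle)


print(cyclic_pattern(12))
```

If the debugger shows the instruction pointer holding four bytes of the pattern after a crash, `pattern_offset` applied to that value gives the distance from the start of your input to the overwritten location.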

Once a vulnerability is identified, it should be used to compromise the integrity of the program. However, there are a variety of ways to achieve this goal. For programs such as web servers, obtaining information from another user may be the end goal. In others, changing your permissions may be helpful, for example changing the permissions of a local user to those of the administrator.

Lecture

The first lecture, Memory Corruption 101, provides background and step-by-step explanation of exploiting an overflow on Windows. The second lecture, Memory Corruption 102, covers more advanced topics, including web browser exploitation. Both of these lectures use Windows-specific examples but the techniques and process are applicable across operating systems that use the x86 instruction set. Remember that the names of functions and sometimes the calling conventions will be different when you are working with UNIX/Linux binaries.

Memory Corruption 101
Memory Corruption 102

Tools

We recommend using GDB to debug the challenges in this module since all of them are compiled for 32-bit Linux. However, GDB is intended for debugging source code, not stripped binaries without symbols and debugging information. Projects such as GEF, peda, and voltron are attempts at making gdb more useful for debugging when source code is not available. We recommend making a .gdbinit file in your home directory with at least the following commands:

   set disassembly-flavor intel
   set follow-fork-mode child

Workshop

In order to run these challenges, you'll need to set up an Ubuntu 14.04 (32-bit) virtual machine. We recommend using VMware Player, since it's free and well supported. When you have it running, open a terminal and install socat with the command sudo apt-get install socat.

There are three challenges in this workshop, each contained within this folder when you clone this repository in git. The ultimate goal of each challenge is to manipulate the executable into reading the flag to you. For each challenge, try to translate the disassembly into C code. After making an attempt, you can verify your guess by checking the actual C source provided. Then, try to exploit it to read you the flag.

Challenge: Easy

Make sure the flag is in the same directory as the easy program. Once you execute easy, it will listen for instructions on port 12346.

Challenge: Social

Similar to easy, make sure the flag and host.sh are in the same directory as the social program. Once you execute social, it will listen for instructions on port 12347.
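
Since both challenges listen on a TCP port, a small client script is a convenient way to send input and capture whatever comes back. The following is a minimal sketch using Python's standard socket module. The function names, the default host of 127.0.0.1, and the timeout are assumptions; what each challenge actually expects as input is for you to work out.

```python
import socket


def exchange(sock: socket.socket, payload: bytes, bufsize: int = 4096) -> bytes:
    """Send payload, then read until the peer closes or a timeout fires."""
    sock.sendall(payload)
    chunks = []
    try:
        while True:
            data = sock.recv(bufsize)
            if not data:
                break
            chunks.append(data)
    except socket.timeout:
        pass  # keep whatever arrived before the timeout
    return b"".join(chunks)


def talk_to_challenge(port: int, payload: bytes,
                      host: str = "127.0.0.1", timeout: float = 2.0) -> bytes:
    """Connect to a locally hosted challenge and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return exchange(sock, payload)
```

With easy running in the virtual machine, a call such as `talk_to_challenge(12346, b"test\n")` would return the bytes the service sends back before it closes the connection or goes quiet.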

Resources

Using GDB to Develop Exploits - A Basic Run Through
Exploiting Format String Vulnerabilities
Low-level Software Security: Attacks and Defenses

Exploiting Binaries 2

In this module, we continue to examine the ways that native applications can be exploited and focus on using return-oriented programming (ROP) to achieve that goal. ROP is the process of stitching together existing executable fragments of code ending in a return instruction. By creating chains of addresses of these ‘gadgets’ one can write new programs without introducing any new code.
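
To make the idea of a chain of gadget addresses concrete, the sketch below packs a list of 32-bit values into the little-endian byte layout an x86 stack expects. Every address here is a made-up placeholder and the helper names are hypothetical; real gadget addresses have to be found in the target binary.

```python
import struct
from typing import Iterable


def p32(value: int) -> bytes:
    """Pack a 32-bit value in little-endian byte order, as x86 stores it."""
    return struct.pack("<I", value)


def build_chain(padding_len: int, words: Iterable[int]) -> bytes:
    """Filler up to the saved return address, then the packed chain."""
    return b"A" * padding_len + b"".join(p32(word) for word in words)


# Hypothetical addresses, for illustration only.
POP_RET = 0x08048ABC      # assumed gadget: pop <reg>; ret
ARGUMENT = 0xDEADBEEF     # value the pop would load
NEXT_GADGET = 0x08048DEF  # assumed address the final ret jumps to

payload = build_chain(16, [POP_RET, ARGUMENT, NEXT_GADGET])
print(payload.hex())
```

Each `ret` pops the next address from the stack, so the order of the words in the list is the order in which the gadgets run.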

Keep in mind that you will need to be flexible in identifying methods to exploit programs. Sometimes it’s necessary to abuse a vulnerability multiple times in the course of an exploit. At times, you may only want to use a ROP bridge to make your shellcode executable and, at others, you may want to use a payload written entirely in ROP. Occasionally, the layout of memory makes unorthodox methods of exploitation favorable. For example, have you considered manufacturing an uncontrolled format string vulnerability using ROP?

Lecture

The lectures this week will discuss return oriented programming (ROP) and code reuse to bypass protections that disallow the execution of attacker-provided data. These lectures go into much greater detail on exploitation and build upon some of what was discussed last week.

Return Oriented Exploitation
Payload already inside data re-use for ROP exploits (Linux-specific)

Workshop

Similar to the previous lesson, there are two executable files located in this folder when you clone the repository. Exploits for each of these programs will require the use of return-oriented programming to read the flags. This week, there is no access to source code provided. You will need to reverse engineer the binaries to discover the vulnerabilities and the techniques required to exploit them. Use the same Ubuntu 14.04 (32-bit) virtual machine as the previous lesson.

Challenge: brute_cookie

Run the bc program. It will listen on port 12345.

Challenge: space

Run host.sh in the same directory as the space program. It will listen on port 12348.

Challenge: rop_mixer

Run host.sh in the same directory as the rop_mixer program. It will listen on port 12349.

Tools

Refer to the tools from last week. If you haven't already, you should become familiar with the binutils suite for *NIX. Tools like readelf, strings, objdump, objcopy, and nm are frequently helpful. Use your package manager and the manpages to install and read about their use.

Several tools exist to aid specifically in the creation of exploits that reuse code. They are more specialized than a standard disassembler because they look for executable snippets of code suitable as ROP gadgets (between instructions, in .rodata, etc).
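
The reason these tools find gadgets "between instructions" is that x86 instructions have variable length, so any byte offset that leads up to a return opcode (0xC3) is a candidate. The sketch below only illustrates that search idea under simplifying assumptions: the function name is hypothetical, it does not disassemble anything, and real tools such as those listed below decode the bytes to discard windows that are not valid instruction sequences.

```python
RET = 0xC3  # x86 near return opcode


def find_ret_candidates(code: bytes, base: int = 0, max_len: int = 4):
    """List (address, raw bytes) windows that end in a ret opcode.

    For every 0xC3 byte, emit each window of 1..max_len preceding bytes
    plus the ret itself. `base` is the assumed load address of `code`.
    """
    results = []
    for index, byte in enumerate(code):
        if byte != RET:
            continue
        for back in range(1, max_len + 1):
            start = index - back
            if start < 0:
                break
            results.append((base + start, code[start:index + 1]))
    return results


# 0x90 = nop, 0x58 = pop eax, 0xC3 = ret
for address, raw in find_ret_candidates(bytes.fromhex("9058c3"), base=0x08048000):
    print(hex(address), raw.hex())
```

Starting one byte later or earlier can yield a different, equally usable instruction sequence, which is why a dedicated gadget finder reports more than a linear disassembly shows.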

RP
ROPGadget
BISC - Simple tool covered in the lecture

Resources

x86-64 buffer overflow exploits and the borrowed code chunks exploitation technique
Surgically returning to randomized lib(c)
Extensive security reference
Dartmouth College: Useful Security and Privacy links
Corelan Team Blog

Web Exploitation

This module follows up on the previous auditing web applications module. In this module we will focus on exploiting those vulnerabilities. By the end of this module you should be comfortable identifying and exploiting the OWASP Top 10.

Lecture

We covered the basics in the previous section on web security, so now we can dive into some more capable tools to achieve greater effects in this module. Learn to master Burp Suite and the Chrome Developer tools to gain a greater understanding of the applications you interact with. BeEF is an example of an XSS proxy and it will pay off to look through its source code and learn how it works.

Burp Suite Training
Chrome Dev Tools
From XSS to reverse shell with BeEF

Workshop

You have been tasked with auditing Gruyere, a small, cheesy web application. Gruyere is available through and hosted by Google. It includes exercises for exploiting many classes of web-specific vulnerabilities including XSS, SQL injection, CSRF, directory traversal and more. For each challenge you can find hints, exploits and methods to patch the vulnerable code.

References
- Google Chrome Console
- OWASP Top 10 Tools and Tactics
- The Tangled Web: Chapter 3
- PHP Primer

Tools SQL Map and BeEF are powerful tools and very fun to play around with, but ultimately not needed for the exercise. If you insist on playing with BeEF, then please try not to hook other users auditing the application.

- Burp Suite
- SQL Map
- The Browser Exploitation Framework (BeEF)

Forensics

In a CTF context, "Forensics" challenges can include file format analysis, steganography, memory dump analysis, or network packet capture analysis. Any challenge to examine and process a hidden piece of information out of static data files (as opposed to executable programs or remote servers) could be considered a Forensics challenge (unless it involves cryptography, in which case it probably belongs in the Crypto category).

Forensics is a broad CTF category that does not map well to any particular job role in the security industry, although some challenges model the kinds of tasks seen in Incident Response (IR). Even in IR work, computer forensics is usually the domain of law enforcement seeking evidentiary data and attribution, rather than the commercial incident responder who may just be interested in expelling an attacker and/or restoring system integrity.

Unlike most CTF forensics challenges, a real-world computer forensics task would hardly ever involve unraveling a scheme of cleverly encoded bytes, hidden data, matryoshka-like files-within-files, or other such brain-teaser puzzles. One would typically not bust a criminal case by carefully reassembling a corrupted PNG file, revealing a photo of a QR code that decodes to a password for a zip archive containing an NES ROM that when played will output the confession. Rather, real-world forensics typically requires that a practitioner find indirect evidence of maliciousness: either the traces of an attacker on a system, or the traces of "insider threat" behavior. Real-world computer forensics is largely about knowing where to find incriminating clues in logs, in memory, in filesystems/registries, and associated file and filesystem metadata. Also, network (packet capture) forensics is more about metadata analysis than content analysis, as most network sessions are TLS-encrypted between endpoints now.

This disconnect between the somewhat artificial puzzle-game CTF "Forensics" and the way that forensics is actually done in the field might be why this category does not receive as much attention as the vulnerability-exploitation style challenges. It may also lack the "black hat attacker" appeal that draws many players to participate in CTFs. Regardless, many players enjoy the variety and novelty in CTF forensics challenges. It can also be a more beginner friendly category, in which the playing field is evened out by the fact that there are no $5,000 professional tools like IDA Pro Ultimate Edition with Hex-Rays Decompiler that would give a huge advantage to some players but not others, as is the case with executable analysis challenges.

Requisite Skills For solving forensics CTF challenges, the three most useful abilities are probably:

- Knowing a scripting language (e.g., Python)
- Knowing how to manipulate binary data (byte-level manipulations) in that language
- Recognizing formats, protocols, structures, and encodings

The first and second you can learn and practice outside of a CTF, but the third may only come from experience. Hopefully with this document, you can at least get a good head start.

And of course, like most CTF play, the ideal environment is a Linux system with – occasionally – Windows in a VM. MacOS is not a bad environment to substitute for Linux, if you can accept that some open-source tools may not install or compile correctly.

Manipulating Binary Data in Python Assuming you have already picked up some Python programming, you still may not know how to effectively work with binary data. Low-level languages like C might be more naturally suited for this task, but Python's many useful packages from the open-source community outweigh its learning curve for working with binary data.

Here are some examples of working with binary data in Python.

Writing or reading a file in binary mode:

# Read the whole file as bytes, then write it back out reversed.
with open('Reverseit', 'rb') as f:
    s = f.read()
with open('ItsReversed', 'wb') as f:
    f.write(s[::-1])

The bytearray type is a mutable sequence of bytes, and is available in both Python 2 and 3:

>>> s = bytearray(b"Hello World")
>>> for c in s: print(c)
...
72
101
108
108
111
32
87
111
114
108
100
>>>

You can also define a bytearray from a Unicode string of hexadecimal digits:

>>> example2 = bytearray.fromhex(u'00 ff')
>>> example2
bytearray(b'\x00\xff')
>>> example2[1]
255

The bytearray type has most of the same convenient methods as a Python str or list: split(), insert(), reverse(), extend(), pop(), remove(), etc.
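As one more illustration of byte-level manipulation, here is a minimal sketch of XOR-ing a buffer with a repeating key, an operation that comes up constantly in CTF data. The function name xor_bytes, the key, and the sample data are made up for this example.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a repeating key; returns a new bytearray."""
    if not key:
        raise ValueError("key must not be empty")
    out = bytearray(data)               # mutable copy; the input is left untouched
    for i in range(len(out)):
        out[i] ^= key[i % len(key)]     # the key repeats when shorter than the data
    return out


# Hypothetical sample data and key, for illustration only.
scrambled = xor_bytes(b"flag{demo}", b"\x42")
print(scrambled.hex())
print(bytes(xor_bytes(scrambled, b"\x42")))   # XOR with the same key undoes it
```

Because XOR is its own inverse, the same function both scrambles and recovers the data.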

Reading a file into a bytearray for processing:

data = bytearray(open('challenge.png', 'rb').read())

Common Forensics Concepts and Tools What follows is a high-level overview of some of the common concepts in forensics CTF challenges, and some recommended tools for performing common tasks.

File format identification (and "magic bytes") Almost every forensics challenge will involve a file, usually without any context that would give you a guess as to what the file is. Filetypes, as a concept for users, have historically been indicated either with filetype extensions (e.g., readme.md for MarkDown), MIME types (as on the web, with Content-Type headers), or with metadata stored in the filesystem (as with the mdls command in MacOS). In a CTF, part of the game is to identify the file ourselves, using a heuristic approach.

The traditional heuristic for identifying filetypes on UNIX is libmagic, which is a library for identifying so-called "magic numbers" or "magic bytes," the unique identifying marker bytes in filetype headers. The libmagic library is the basis for the file command.

$ file screenshot.png
screenshot.png: PNG image data, 1920 x 1080, 8-bit/color RGBA, non-interlaced

Keep in mind that heuristics, and tools that employ them, can be easily fooled. Because it is a CTF, you may be presented with a file that has been intentionally crafted to mislead file. Also, if a file contains another file embedded somewhere inside it, the file command is only going to identify the containing filetype. In scenarios such as these you may need to examine the file content more closely.
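When you need to check magic bytes yourself, the idea behind file can be sketched in a few lines of Python. The signature table below is a small hand-picked subset of well-known headers, nowhere near what libmagic knows, and the function name identify is just a label for this sketch.

```python
# A few well-known file signatures ("magic bytes") found at offset 0.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"PK\x03\x04", "ZIP archive (also DOCX, XLSX, JAR)"),
    (b"%PDF-", "PDF document"),
    (b"\x1f\x8b", "gzip data"),
    (b"\x7fELF", "ELF executable"),
]


def identify(data):
    """Return a description for the first signature that matches the start of data."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


print(identify(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
print(identify(b"just some text"))
```

Like file, this only looks at the start of the buffer, so it says nothing about a second file embedded further in.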

TrID is a more sophisticated version of file. Although it's closed-source, it's free and works across platforms. It also uses an identification heuristic, but with certainty percentages. Its advantage is its larger set of known filetypes that include a lot of proprietary and obscure formats seen in the real world.

File carving Files-within-files is a common trope in forensics CTF challenges, and also in embedded systems' firmware where primitive or flat filesystems are common. The term for identifying a file embedded in another file and extracting it is "file carving." One of the best tools for this task is the firmware analysis tool binwalk.

scalpel, now maintained alongside the Sleuth Kit (discussed further under Filesystems), is another tool for file carving. It was originally based on the older Foremost carver.

To manually extract a sub-section of a file (from a known offset to a known offset), you can use the dd command. Many hex-editors also offer the ability to copy bytes and paste them as a new file, so you don't need to study the offsets.

Example of file-carving with dd from a file offset of 1335205 for a length of 40668937 bytes:

$ dd if=./file_with_a_file_in_it.xxx of=./extracted_file.xxx bs=1 skip=1335205 count=40668937

Although the above tools should suffice, in some cases you may need to programmatically extract a sub-section of a file using Python, using things like Python's re or regex modules to identify magic bytes, and the zlib module to extract zlib streams.
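A rough sketch of that programmatic approach, using only the standard library: re locates candidate magic bytes, a slice plays the role of dd, and zlib attempts to inflate a stream at a given offset. The helper names and the synthetic test blob are invented for this example. The two bytes 78 9c are the usual header of a zlib stream written at the default compression level; other levels produce a different second byte.

```python
import re
import zlib


def find_offsets(data, magic):
    """Return every offset at which the byte pattern magic occurs in data."""
    return [m.start() for m in re.finditer(re.escape(magic), data)]


def carve(data, offset, length):
    """In-memory equivalent of: dd bs=1 skip=offset count=length."""
    return data[offset:offset + length]


def inflate_at(data, offset):
    """Try to decompress a zlib stream starting at offset; None if that fails."""
    try:
        # decompressobj stops at the end of the stream and ignores trailing bytes.
        return zlib.decompressobj().decompress(data[offset:])
    except zlib.error:
        return None


# Synthetic blob: 16 junk bytes, a zlib stream, then trailing junk.
blob = b"JUNK" * 4 + zlib.compress(b"flag{carved}") + b"TRAILING"

for offset in find_offsets(blob, b"\x78\x9c"):
    recovered = inflate_at(blob, offset)
    if recovered is not None:
        print(offset, recovered)
```

Candidate offsets are only guesses, which is why each one is tested by actually trying to inflate from it.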

Initial analysis At first you may not have any leads, and need to explore the challenge file at a high-level for a clue toward what to look at next. Some of the useful commands to know are strings to search for all plain-text strings in the file, grep to search for particular strings, bgrep to search for non-text data patterns, and hexdump.

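Example of using strings to find ASCII strings, with decimal file offsets (the -o flag prints decimal offsets on macOS/BSD; with GNU strings, -o means octal, so use -t d instead):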

$ strings -o screenshot.png

    12 IHDR
    36 $iCCPICC Profile
    88 U2EI4HB

...

    767787 IEND

Unicode strings, if they are UTF-8, might show up in the search for ASCII strings. But to search for other encodings, see the documentation for the -e flag. Beware the many encoding pitfalls of strings: some caution against its use in forensics at all, but for simple tasks it still has its place.
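If you want the same kind of output inside a script, the core of strings is easy to approximate. This is a simplified sketch: it treats only bytes 0x20 through 0x7e as printable and handles ASCII alone, and the function name ascii_strings and the sample buffer are made up for illustration.

```python
import re


def ascii_strings(data, min_len=4):
    """Yield (offset, text) for each run of printable ASCII at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for m in pattern.finditer(data):
        yield m.start(), m.group().decode("ascii")


# Made-up buffer; "abc" is skipped because it is shorter than min_len.
sample = b"\x00\x01IHDR\x00\x00abc\x00IEND\xff"
for offset, text in ascii_strings(sample):
    print(offset, text)
```

Having the offsets in Python makes it easy to feed interesting hits straight into a carving or decoding step.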

Example of searching for the PNG magic bytes in a PNG file:

$ bgrep 89504e47 screenshot.png
screenshot.png: 00000000

Example of using hexdump:

The advantage of hexdump is not that it is the best hex-editor (it's not), but that you can pipe output of other commands directly into hexdump, and/or pipe its output to grep, or format its output using format strings.

Example of using hexdump format strings to output the first 50 bytes of a file as a series of 32-bit integers in hex:

$ hexdump -n 50 -e '"0x%08x "' screenshot.png
0x474e5089 0x0a1a0a0d 0x0d000000 0x52444849 0xca050000 0x88020000 0x00000608 0xc93d4000 0x180000a4 0x43436924 0x43434950 0x6f725020 0x00006966

Other uses of the hexdump command.
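The first value in that output, 0x474e5089, is the PNG magic 89 50 4e 47 with its bytes reversed, because hexdump reads each 4-byte group in the host's byte order (little-endian on x86). A short sketch with Python's struct module makes the byte order explicit. The 16 header bytes are taken from the hexdump output above.

```python
import struct

# First 16 bytes of the PNG from the hexdump example above.
png_header = bytes.fromhex("89504e470d0a1a0a0000000d49484452")

# "<4I" = four little-endian unsigned 32-bit integers (what hexdump showed).
words = struct.unpack("<4I", png_header)
print(" ".join("0x%08x" % w for w in words))
# prints: 0x474e5089 0x0a1a0a0d 0x0d000000 0x52444849

# PNG itself stores integers big-endian, so the IHDR chunk length reads as:
(ihdr_length,) = struct.unpack(">I", png_header[8:12])
print(ihdr_length)
# prints: 13
```

When a value from a hex dump looks scrambled, trying the other byte order is usually the first thing to check.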

Binary-as-text encodings Binary is 1's and 0's, but often is transmitted as text. It would be wasteful to transmit actual sequences of 101010101, so the data is first encoded using one of a variety of methods. This is what is referred to as binary-to-text encoding, a popular trope in CTF challenges. When doing a strings analysis of a file as discussed above, you may uncover this binary data encoded as text strings.

We mentioned that to excel at forensics CTF challenges, it is important to be able to recognize encodings. Some can be identified at a glance, such as Base64 encoded content, identifiable by its alphanumeric charset (plus "+" and "/") and its "=" padding suffix (when present). There are many Base64 encoder/decoders online, or you can use the base64 command:

$ echo aGVsbG8gd29ybGQh | base64 -D
hello world!

(The -D flag is the macOS spelling; GNU coreutils base64 uses -d or --decode.)

ASCII-encoded hexadecimal is also identifiable by its charset (0-9, A-F). ASCII characters themselves occupy a certain range of bytes (0x00 through 0x7f, see man ascii), so if you are examining a file and find a string like 68 65 6c 6c 6f 20 77 6f 72 6c 64 21, it's important to notice the preponderance of 0x60's here: this is ASCII. Technically, it's text ("hello world!") encoded as ASCII (binary) encoded as hexadecimal (text again). Confused yet? 😉
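Both encodings above can be decoded from Python with the standard library. The helper below is only a guessing sketch: it tries hex first and Base64 second, which is an arbitrary choice, since a string such as "deadbeef" is valid in both. The name decode_guess is invented for this example.

```python
import base64
import binascii


def decode_guess(text):
    """Try hex, then Base64; return the decoded bytes, or None if neither fits."""
    compact = "".join(text.split())          # drop spaces and newlines
    try:
        return bytes.fromhex(compact)
    except ValueError:
        pass
    try:
        return base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        return None


print(decode_guess("aGVsbG8gd29ybGQh"))
print(decode_guess("68 65 6c 6c 6f 20 77 6f 72 6c 64 21"))
print(decode_guess("not encoded!"))
```

In a real challenge you would usually print every plausible decoding rather than stop at the first one that succeeds.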

There are several sites that provide online encoder-decoders for a variety of encodings. For a more local converter, try the xxd command.

Example of using xxd to do text-as-ascii-to-hex encoding:

$ echo hello world\! | xxd -p
68656c6c6f20776f726c64210a

Common File formats We've discussed the fundamental concepts and the tools for the more generic forensics tasks. Now, we'll discuss more specific categories of forensics challenges, and the recommended tools for analyzing challenges in each category.

It would be impossible to prepare for every possible data format, but there are some that are especially popular in CTFs. If you were prepared with tools for analyzing the following, you would be prepared for the majority of Forensics challenges:

- Archive files (ZIP, TGZ)
- Image file formats (JPG, GIF, BMP, PNG)
- Filesystem images (especially EXT4)
- Packet captures (PCAP, PCAPNG)
- Memory dumps
- PDF
- Video (especially MP4) or Audio (especially WAV, MP3)
- Microsoft's Office formats (RTF, OLE, OOXML)

Some of the harder CTF challenges pride themselves on requiring players to analyze an especially obscure format for which no publicly available tools exist. You will need to learn to quickly locate documentation and tools for unfamiliar formats. Many file formats are well-described in the public documentation you can find with a web search, but having some familiarity with the file format specifications will also help, so we include links to those here.

When analyzing file formats, a file-format-aware (a.k.a. templated) hex-editor like 010 Editor is invaluable. An open-source alternative has emerged called Kaitai. Additionally, a lesser-known feature of the Wireshark network protocol analyzer is its ability to analyze certain media file formats like GIF, JPG, and PNG. All of these tools, however, are made to analyze non-corrupted and well-formatted files. Many CTF challenges task you with reconstructing a file based on missing or zeroed-out format fields, etc.

You also ought to check out the wonderful file-formats illustrated visually by Ange Albertini.

Archive files Most CTF challenges are contained in a zip, 7z, rar, tar or tgz file, but only in a forensics challenge will the archive container file be a part of the challenge itself. Usually the goal here is to extract a file from a damaged archive, or find data embedded somewhere in an unused field (a common forensics challenge). Zip is the most common in the real world, and the most common in CTFs.

There are a handful of command-line tools for zip files that will be useful to know about.

- unzip will often output helpful information on why a zip will not decompress.
- zipdetails -v will provide in-depth information on the values present in the various fields of the format.
- zipinfo lists information about the zip file's contents, without extracting it.
- zip -F input.zip --out output.zip and zip -FF input.zip --out output.zip attempt to repair a corrupted zip file.
- fcrackzip brute-force guesses a zip password (for passwords shorter than 7 characters or so).

Zip file format specification

One important security-related note about password-protected zip files is that they do not encrypt the filenames and original file sizes of the compressed files they contain, unlike password-protected RAR or 7z files.
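The point about unencrypted names and sizes can be checked with Python's zipfile module, which reads the archive's directory without needing a password. In this sketch the function name list_entries is made up. The encrypted flag is bit 0 of each entry's general-purpose flag field. Because zipfile cannot create encrypted archives, the demonstration archive here is unencrypted.

```python
import io
import zipfile


def list_entries(zip_bytes):
    """Return (name, uncompressed_size, is_encrypted) for every archive member."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return [
            (info.filename, info.file_size, bool(info.flag_bits & 0x1))
            for info in zf.infolist()
        ]


# Build a small unencrypted archive in memory for demonstration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("flag.txt", b"not a flag")

print(list_entries(buf.getvalue()))
```

Run against a password-protected zip, the same function should still list every name and size, with the last field set to True.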

Another note about zip cracking is that if you have an unencrypted/uncompressed copy of any one of the files that is compressed in the encrypted zip, you can perform a "plaintext attack" and crack the zip, as detailed here, and explained in this paper. The newer scheme for password-protecting zip files (with AES-256, rather than "ZipCrypto") does not have this weakness.

Image file format analysis CTFs are supposed to be fun, and image files are good for containing hacker memes, so of course image files often appear in CTF challenges. Image file formats are complex and can be abused in many ways that make for interesting analysis puzzles involving metadata fields, lossy and lossless compression, checksums, steganography, or visual data encoding schemes.

The easy initial analysis step is to check an image file's metadata fields with exiftool. If an image file has been abused for a CTF, its EXIF might identify the original image dimensions, camera type, embedded thumbnail image, comments and copyright strings, GPS location coordinates, etc. There might be a gold mine of metadata, or there might be almost nothing. It's worth a look.

Example of exiftool output, truncated:

$ exiftool screenshot.png
ExifTool Version Number         : 10.53
File Name                       : screenshot.png
Directory                       : .
File Size                       : 750 kB
File Modification Date/Time     : 2017:06:13 22:34:05-04:00
File Access Date/Time           : 2017:06:17 13:19:58-04:00
File Inode Change Date/Time     : 2017:06:13 22:34:05-04:00
File Permissions                : rw-r--r--
File Type                       : PNG
File Type Extension             : png
MIME Type                       : image/png
Image Width                     : 1482
Image Height                    : 648
Bit Depth                       : 8
Color Type                      : RGB with Alpha
Compression                     : Deflate/Inflate
...
Primary Platform                : Apple Computer Inc.
CMM Flags                       : Not Embedded, Independent
Device Manufacturer             : APPL
Device Model                    :
...
Exif Image Width                : 1482
Exif Image Height               : 648
Image Size                      : 1482x648
Megapixels                      : 0.960

PNG files, in particular, are popular in CTF challenges, probably for their lossless compression suitable for hiding non-visual data in the image. PNG files can be dissected in Wireshark. To verify correctness or attempt to repair corrupted PNGs you can use pngcheck. If you need to dig into PNG a little deeper, the pngtools package might be useful.
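To see what a checker like pngcheck is looking at, here is a minimal sketch that walks the chunks of a PNG and verifies each CRC. A PNG is an 8-byte signature followed by chunks, each made of a 4-byte big-endian length, a 4-byte type, the payload, and a CRC-32 computed over type plus payload. The helper names are invented, and the demonstration bytes are a chunk skeleton (IHDR and IEND only), not a viewable image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunks(data):
    """Yield (type, payload, crc_ok) for each complete chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("missing PNG signature")
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 12 + length
        if end > len(data):
            break                                   # truncated chunk: stop here
        payload = data[pos + 8:pos + 8 + length]
        (stored_crc,) = struct.unpack(">I", data[end - 4:end])
        crc_ok = (zlib.crc32(ctype + payload) & 0xFFFFFFFF) == stored_crc
        yield ctype, payload, crc_ok
        pos = end


def make_chunk(ctype, payload):
    """Build one chunk with a correct CRC (used only to create demo data)."""
    body = ctype + payload
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + body + struct.pack(">I", crc)


# IHDR payload: width, height, bit depth, color type 6 (RGBA), then three zero bytes.
ihdr = struct.pack(">IIBBBBB", 1482, 648, 8, 6, 0, 0, 0)
tiny = PNG_SIGNATURE + make_chunk(b"IHDR", ihdr) + make_chunk(b"IEND", b"")

for ctype, payload, crc_ok in png_chunks(tiny):
    print(ctype.decode("ascii"), len(payload), "CRC ok" if crc_ok else "CRC MISMATCH")
```

A CRC mismatch on IHDR is a useful hint that a header field such as the width or height was altered after the file was written.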

Steganography, the practice of concealing some amount of secret data within unrelated data that serves as its vessel (a.k.a. the "cover text"), is extraordinarily rare in the real world (made effectively obsolete by strong cryptography), but is another popular trope in CTF forensics challenges. Steganography could be implemented using any kind of data as the "cover text," but media file formats are ideal because they tolerate a certain amount of unnoticeable data loss (the same characteristic that makes lossy compression schemes possible). The difficulty with steganography is that extracting the hidden message requires not only detecting that steganography has been used, but also knowing the exact steganographic tool used to embed it. Given a challenge file, if we suspect steganography, we must do at least a little guessing to check if it's present. Stegsolve (JAR download link) is often used to apply various steganography techniques to image files in an attempt to detect and extract hidden data. You may also try zsteg.

Gimp provides the ability to alter various aspects of the visual data of an image file. CTF challenge authors have historically used altered Hue/Saturation/Luminance values or color channels to hide a secret message. Gimp is also good for confirming whether something really is an image file: for instance, when you believe you have recovered image data from a display buffer in a memory dump or elsewhere, but you lack the image file header that specifies pixel format, image height and width and so on. Open your mystery data as "raw image data" in Gimp and experiment with different settings.

The ImageMagick toolset can be incorporated into scripts and enable you to quickly identify, resize, crop, modify, convert, and otherwise manipulate image files. It can also find the visual and data difference between two seemingly identical images with its compare tool.

If you are writing a custom image file format parser, import the Python Imaging Library (PIL), now maintained as Pillow. It enables you to extract frames from animated GIFs or even individual pixels from a JPG, and it has native support for most major image file formats.

If working with QR codes (2D barcodes), also check out the qrtools module for Python. You can decode an image of a QR code with fewer than 5 lines of Python. Of course, if you just need to decode one QR code, any smartphone will do.

Filesystems analysis Occasionally, a CTF forensics challenge consists of a full disk image, and the player needs to have a strategy for finding a needle (the flag) in this haystack of data. Triage, in computer forensics, refers to the ability to quickly narrow down what to look at. Without a strategy, the only option is looking at everything, which is time-prohibitive (not to mention exhausting).

Example of mounting a CD-ROM filesystem image:

mkdir /mnt/challenge
mount -t iso9660 challengefile /mnt/challenge

Once you have mounted the filesystem, the tree command is not bad for a quick look at the directory structure to see if anything sticks out to you for further analysis.

You may not be looking for a file in the visible filesystem at all, but rather a hidden volume, unallocated space (disk space that is not a part of any partition), a deleted file, or a non-file filesystem structure like an NTFS alternate data stream (see http://www.nirsoft.net/utils/alternate_data_streams.html). For EXT3 and EXT4 filesystems, you can attempt to find deleted files with extundelete. For everything else, there's TestDisk: recover missing partition tables, fix corrupted ones, undelete files on FAT or NTFS, etc.

The Sleuth Kit and its accompanying web-based user interface, "Autopsy," is a powerful open-source toolkit for filesystem analysis. It's a bit geared toward law-enforcement tasks, but can be helpful for tasks like searching for a keyword across the entire disk image, or looking at the unallocated space.

Embedded device filesystems are a unique category of their own. Made for fixed-function low-resource environments, they can be compressed, single-file, or read-only. Squashfs is one popular implementation of an embedded device filesystem. For images of embedded devices, you're better off analyzing them with firmware-mod-kit or binwalk.

Packet Capture (PCAP) file analysis Network traffic is captured and stored in a PCAP (packet capture) file with a program like tcpdump or Wireshark (both based on libpcap). A popular CTF challenge is to provide a PCAP file representing some network traffic and challenge the player to recover/reconstitute a transferred file or transmitted secret. Complicating matters, the packets of interest are usually in an ocean of unrelated traffic, so triaging and filtering the data is also a job for the player.

For initial analysis, take a high-level view of the packets with Wireshark's statistics or conversations view, or its capinfos command. Wireshark, and its command-line version tshark, both support the concept of using "filters," which, if you master the syntax, can quickly reduce the scope of your analysis. There is also an online service called PacketTotal where you can submit PCAP files up to 50MB, and graphically display some timelines of connections, and SSL metadata on the secure connections. Plus it will highlight file transfers and show you any "suspicious" activity. If you already know what you're searching for, you can do grep-style searching through packets using ngrep.

Just as "file carving" refers to the identification and extraction of files embedded within files, "packet carving" is a term sometimes used to describe the extraction of files from a packet capture. There are expensive commercial tools for recovering files from captured packets, but one open-source alternative is the Xplico framework. Wireshark also has an "Export Objects" feature to extract data from the capture (e.g., File -> Export Objects -> HTTP -> Save all). Beyond that, you can try tcpxtract, Network Miner, Foremost, or Snort.

If you want to write your own scripts to process PCAP files directly, the dpkt Python package for pcap manipulation is recommended. You could also interface Wireshark from your Python using Wirepy.

If trying to repair a damaged PCAP file, there is an online service for repairing PCAP files called PCAPfix.

A note about PCAP vs PCAPNG: there are two versions of the PCAP file format; PCAPNG is newer and not supported by all tools. You may need to convert a file from PCAPNG to PCAP using Wireshark or another compatible tool, in order to work with it in some other tools.
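Telling the two formats apart only takes the first four bytes of the file. The sketch below checks those magic numbers and, for the classic PCAP format, walks the packet records with struct: a 24-byte global header, then for each packet a 16-byte record header (timestamp seconds, timestamp fraction, captured length, original length) followed by the captured bytes. The function names are made up, PCAPNG parsing is deliberately left out, and the demo capture is synthetic.

```python
import struct

# Classic pcap magic numbers as they appear on disk, with the struct byte-order prefix.
PCAP_MAGICS = {
    b"\xd4\xc3\xb2\xa1": ("pcap", "<"),       # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": ("pcap", ">"),       # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": ("pcap-ns", "<"),    # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": ("pcap-ns", ">"),    # big-endian, nanosecond timestamps
}
PCAPNG_MAGIC = b"\x0a\x0d\x0d\x0a"            # block type of a Section Header Block


def capture_format(data):
    """Return 'pcapng', 'pcap', 'pcap-ns', or 'unknown' based on the first 4 bytes."""
    if data[:4] == PCAPNG_MAGIC:
        return "pcapng"
    return PCAP_MAGICS.get(data[:4], ("unknown", None))[0]


def pcap_packets(data):
    """Yield (ts_sec, ts_frac, payload) for each record of a classic pcap file."""
    kind, endian = PCAP_MAGICS.get(data[:4], (None, None))
    if kind is None:
        raise ValueError("not a classic pcap file")
    pos = 24                                   # skip the global header
    while pos + 16 <= len(data):
        ts_sec, ts_frac, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[pos:pos + 16])
        pos += 16
        yield ts_sec, ts_frac, data[pos:pos + incl_len]
        pos += incl_len


# Synthetic capture: global header (version 2.4, link type 1) plus one 4-byte packet.
header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
record = struct.pack("<IIII", 1, 0, 4, 4) + b"ABCD"
print(capture_format(header + record))
print(list(pcap_packets(header + record)))
```

For anything beyond a quick look, a maintained parser such as the dpkt package mentioned above is the better choice.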

Memory dump analysis For years, computer forensics was synonymous with filesystem forensics, but as attackers became more sophisticated, they started to avoid the disk. Also, a snapshot of memory often contains context and clues that are impossible to find on disk because they only exist at runtime (operational configurations, remote-exploit shellcode, passwords and encryption keys, etc). So memory snapshot / memory dump forensics has become a popular practice in incident response. In a CTF, you might find a challenge that provides a memory dump image, and tasks you with locating and extracting a secret or a file from within it.

The premier open-source framework for memory dump analysis is Volatility. Volatility is a Python script for parsing memory dumps that were gathered with an external tool (or a VMware memory image gathered by pausing the VM). So, given the memory dump file and the relevant "profile" (the OS from which the dump was gathered), Volatility can start identifying the structures in the data: running processes, passwords, etc. It is also extensible using plugins for extracting various types of artifact.

Ethscan is made to find data in a memory dump that looks like network packets, and then extract it into a pcap file for viewing in Wireshark. There are plugins for extracting SQL databases, Chrome history, Firefox history and much more.

PDF file analysis PDF is an extremely complicated document file format, with enough tricks and hiding places to write about for years. This also makes it popular for CTF forensics challenges. The NSA wrote a guide to these hiding places in 2008 titled "Hidden Data and Metadata in Adobe PDF Files: Publication Risks and Countermeasures." It's no longer available at its original URL, but you can find a copy here. Ange Albertini also keeps a wiki on GitHub of PDF file format tricks.

The PDF format is partially plain-text, like HTML, but with many binary "objects" in the contents. Didier Stevens has written good introductory material about the format. The binary objects can be compressed or even encrypted data, and include content in scripting languages like JavaScript or Flash. To display the structure of a PDF, you can either browse it with a text editor, or open it with a PDF-aware file-format editor like Origami.

qpdf is one tool that can be useful for exploring a PDF and transforming or extracting information from it. Another is a framework in Ruby called Origami.

When exploring PDF content for hidden data, some of the hiding places to check include:

- non-visible layers
- Adobe's metadata format "XMP"
- the "incremental generation" feature of PDF wherein a previous version is retained but not visible to the user
- white text on a white background
- text behind images
- an image behind an overlapping image
- non-displayed comments

There are also several Python packages for working with the PDF file format, like PeepDF, that enable you to write your own parsing scripts.
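For a quick first pass without any PDF library, a crude sketch like the following can help. It finds stream ... endstream sections with a regular expression and keeps whatever zlib can inflate, which covers FlateDecode streams only. It also counts %%EOF markers, since each incremental update normally appends its own. This is pattern matching on raw bytes, not a real PDF parser, and the function names and the hand-made test document (which has no xref table) are assumptions of this example.

```python
import re
import zlib

STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the decompressed body of every stream that zlib is able to inflate."""
    results = []
    for m in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj ignores the end-of-line bytes left before 'endstream'.
            results.append(zlib.decompressobj().decompress(m.group(1)))
        except zlib.error:
            continue                     # not Flate-compressed (or encrypted)
    return results


def count_revisions(pdf_bytes):
    """Count %%EOF markers; more than one hints at incremental updates."""
    return pdf_bytes.count(b"%%EOF")


# Hand-made PDF-like bytes for demonstration only.
body = zlib.compress(b"BT (flag{in_a_stream}) Tj ET")
pdf = (b"%PDF-1.4\n1 0 obj\n<< /Length " + str(len(body)).encode()
       + b" /Filter /FlateDecode >>\nstream\n" + body
       + b"\nendstream\nendobj\n%%EOF\n")

print(inflate_streams(pdf))
print(count_revisions(pdf))
```

Anything this misses (other filters, object streams, encryption) is where a dedicated tool such as qpdf or PeepDF takes over.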

Video and Audio file analysis Like image file formats, audio and video file trickery is a common theme in CTF forensics challenges not because hacking or data hiding ever happens this way in the real world, but just because audio and video is fun. As with image file formats, steganography might be used to embed a secret message in the content data, and again you should know to check the file metadata areas for clues. Your first step should be to take a look with the mediainfo tool (or exiftool) and identify the content type and look at its metadata.

Audacity is the premier open-source audio file and waveform-viewing tool, and CTF challenge authors love to encode text into audio waveforms, which you can see using the spectrogram view (although a specialized tool called Sonic Visualiser is better for this task in particular). Audacity can also enable you to slow down, reverse, and do other manipulations that might reveal a hidden message if you suspect there is one (if you can hear garbled audio, interference, or static). Sox is another useful command-line tool for converting and manipulating audio files.

It's also common to check least-significant-bits (LSB) for a secret message. Most audio and video media formats use discrete (fixed-size) "chunks" so that they can be streamed; the LSBs of those chunks are a common place to smuggle some data without visibly affecting the file.
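The LSB idea can be sketched on a plain buffer of 8-bit samples. Real challenges differ in which samples carry data, how many bits per sample are used, and the bit order, so everything here is an assumption for illustration: one bit per byte, most significant bit first, starting at the first sample. The function names are made up.

```python
def embed_lsb(samples, message):
    """Return a copy of samples with message hidden in the lowest bit of each byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("cover data too small for message")
    out = bytearray(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit      # clear the lowest bit, then set it
    return out


def extract_lsb(samples, n_bytes):
    """Rebuild n_bytes bytes from the lowest bit of the first n_bytes * 8 samples."""
    if n_bytes * 8 > len(samples):
        raise ValueError("not enough samples")
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for sample in samples[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(64))                    # stand-in for decoded audio samples
stego = embed_lsb(cover, b"hi")
print(extract_lsb(stego, 2))
print(max(abs(a - b) for a, b in zip(cover, stego)))   # no sample changes by more than 1
```

Because each sample changes by at most one unit, the alteration is inaudible, which is exactly why this hiding place is so popular.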

Other times, a message might be encoded into the audio as DTMF tones or morse code. For these, try working with multimon-ng to decode them.

Video file formats are really container formats that hold separate streams of audio and video, multiplexed together for playback. For analyzing and manipulating video file formats, ffmpeg is recommended. ffmpeg -i gives initial analysis of the file content. It can also de-multiplex or play back the content streams. The power of ffmpeg is exposed to Python using ffmpy.

Office file analysis Microsoft has created dozens of office document file formats, many of which are popular for the distribution of phishing attacks and malware because of their ability to include macros (VBA scripts). Microsoft Office document forensic analysis is not too different from PDF document forensics, and just as relevant to real-world incident response.

Broadly speaking, there are two generations of Office file format: the OLE formats (file extensions like DOC, XLS, PPT, along with RTF, which is a text-based format that can carry embedded OLE objects), and the "Office Open XML" formats (file extensions that include DOCX, XLSX, PPTX). Both generations are structured container formats that enable Linked or Embedded content (Objects). OOXML files are actually zip file containers (see the section above on archive files), meaning that one of the easiest ways to check for hidden data is to simply unzip the document:

$ unzip example.docx
Archive:  example.docx

 inflating: [Content_Types].xml     
 inflating: _rels/.rels             
 inflating: word/_rels/document.xml.rels  
 inflating: word/document.xml       
 inflating: word/theme/theme1.xml   
extracting: docProps/thumbnail.jpeg  
 inflating: word/comments.xml       
 inflating: word/settings.xml       
 inflating: word/fontTable.xml      
 inflating: word/styles.xml         
 inflating: word/stylesWithEffects.xml  
 inflating: docProps/app.xml        
 inflating: docProps/core.xml       
 inflating: word/webSettings.xml    
 inflating: word/numbering.xml

$ tree
.
├── [Content_Types].xml
├── _rels
├── docProps
│   ├── app.xml
│   ├── core.xml
│   └── thumbnail.jpeg
└── word
    ├── _rels
    │   └── document.xml.rels
    ├── comments.xml
    ├── document.xml
    ├── fontTable.xml
    ├── numbering.xml
    ├── settings.xml
    ├── styles.xml
    ├── stylesWithEffects.xml
    ├── theme
    │   └── theme1.xml
    └── webSettings.xml

As you can see, some of the structure is created by the file and folder hierarchy. The rest is specified inside the XML files. New Steganographic Techniques for the OOXML File Format, 2011 details some ideas for data hiding techniques, but CTF challenge authors will always be coming up with new ones.
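Since an OOXML document is a zip archive, you can also search every member from Python without extracting anything to disk. The sketch below is an assumption-laden example: the function name grep_ooxml, the flag{...} pattern, and the two-member demo document are all invented, and a real DOCX has many more parts.

```python
import io
import re
import zipfile


def grep_ooxml(doc_bytes, pattern):
    """Return (member_name, match) pairs for a bytes regex across all zip members."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(doc_bytes)) as zf:
        for name in zf.namelist():
            for m in re.finditer(pattern, zf.read(name)):
                hits.append((name, m.group()))
    return hits


# Minimal stand-in for a .docx, built in memory for demonstration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", "<w:t>Nothing to see here</w:t>")
    zf.writestr("word/comments.xml", "<w:t>flag{hidden_in_comments}</w:t>")

print(grep_ooxml(buf.getvalue(), rb"flag\{[^}]*\}"))
```

The same loop is a convenient place to list member names that do not belong in a normal document, which is another common hiding spot.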

Once again, a Python toolset exists for the examination and analysis of OLE and OOXML documents: oletools. For OOXML documents in particular, OfficeDissector is a very powerful analysis framework (and Python library). The latter includes a quick guide to its usage.

Sometimes the challenge is not to find hidden static data, but to analyze a VBA macro to determine its behavior. This is a more realistic scenario, and one that analysts in the field perform every day. The aforementioned dissector tools can indicate whether a macro is present, and probably extract it for you. A typical VBA macro in an Office document, on Windows, will download a PowerShell script to %TEMP% and attempt to execute it, in which case you now have a PowerShell script analysis task too. But malicious VBA macros are rarely complicated, since VBA is typically just used as a jumping-off platform to bootstrap code execution. In the case where you do need to understand a complicated VBA macro, or if the macro is obfuscated and has an unpacker routine, you don't need to own a license to Microsoft Office to debug this. You can use LibreOffice: its interface will be familiar to anyone who has debugged a program; you can set breakpoints and create watch variables and capture values after they have been unpacked but before whatever payload behavior has executed. You can even start a macro of a specific document from a command line:

$ soffice path/to/test.docx macro://./standard.module1.mymacro

Toolkit Creation

Welcome to the module on toolkit creation. A toolkit is a set of utilities that enable you and your team to achieve operational goals in the most efficient manner possible. Your toolkit is a force multiplier that will enable you to minimize the time you spend developing exploits during the game and maximize the return on your development time.

A good toolkit is well rounded and easy to use. You should incorporate software that allows members of your team to communicate effectively, work collaboratively, automate common tasks and provide situational awareness of the game as it plays out.

Lecture Creating a SOC Stealth Rootkit Development Toolsmithing Case Study Organizing and Participating in CTF RTFn

Workshop Create three lists. Populate the first list with the functionality your ideal toolkit would provide. Populate the second list with software that can provide that functionality. Use the third list to rank, in order of importance, functionality that is inadequately supported by the software from list two. Begin developing software that fills in the gaps of your ideal toolkit.

Some functionality you should not neglect:

* Management of exploitation, key aggregation and submission.
* Stealthy and secure payloads or persistence methods.
* Secure communication and collaboration.
* Network/Host situational awareness.

Resources

* Meterpreter Functionality Outline
* IDA Python Overview, IDA Python Download
* NASM Documentation
* Pyershark
* Code you might find useful from the Pwnies (Python) and Ronin (Ruby)
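As an illustration of the "key aggregation and submission" functionality listed above, here is a minimal sketch of a flag queue that pulls keys out of raw exploit output, drops duplicates, and retries failed submissions. The flag format, the class name FlagQueue, and the submit callback are all assumptions made for this example; every game defines its own key format and scoring interface.

```python
import re
from collections import deque

# Hypothetical flag format; substitute the one the game organizers publish.
FLAG_RE = re.compile(r"FLAG\{[0-9a-f]{16}\}")


class FlagQueue:
    """Collects flags from exploit output, de-duplicates them, and submits them in order."""

    def __init__(self, submit):
        # submit is a team-supplied callable: submit(flag) -> True if accepted.
        self._submit = submit
        self._seen = set()
        self._pending = deque()

    def ingest(self, exploit_output: str) -> int:
        """Queue any flags not seen before; return how many were queued."""
        queued = 0
        for flag in FLAG_RE.findall(exploit_output):
            if flag not in self._seen:
                self._seen.add(flag)
                self._pending.append(flag)
                queued += 1
        return queued

    def flush(self) -> list:
        """Try to submit every pending flag; keep failures queued for a later retry."""
        accepted = []
        retry = deque()
        while self._pending:
            flag = self._pending.popleft()
            if self._submit(flag):
                accepted.append(flag)
            else:
                retry.append(flag)
        self._pending = retry
        return accepted


if __name__ == "__main__":
    queue = FlagQueue(submit=lambda flag: True)  # stand-in for a real scoring client
    queue.ingest("noise FLAG{0123456789abcdef} noise FLAG{0123456789abcdef}")
    print(queue.flush())
```

Keeping submission behind a single callable means each exploit only has to print its output, and the team can swap in the real scoring client without touching the exploits.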

Operational Tradecraft

Studies in Tradecraft

Operational tradecraft is generally cultivated with a specific goal in mind. While playing competitive wargames, you will most likely be focused on evading detection and not putting elements of your toolkit (infrastructure, exploits) at risk of inadvertent exposure.

Lecture

* Post-Exploitation and Operational Security
* A Brief History of CTF and Tradecraft
* Operational Use of Offensive Cyber

Workshop

Evaluate the operational tradecraft displayed during the following campaigns. Each design decision employed in these tools and campaigns has an operational philosophy behind it.

Some things to think about while evaluating tradecraft:

* Why did the actor choose to perform/implement X action/capability?
* Were any mistakes made? Was a decision flawed or shortsighted in some way?
* Was an action/capability anomalous? Does it fit with the rest of the operational philosophy?
* What was the actor most interested in protecting? (e.g., tools, identities, employers)
* What can be learned from each campaign from an attacker's standpoint? From a defender's standpoint?

Campaigns

* APT1
* APT28
* Stuxnet
* Careto

Resources

There are few public examples of groups or organizations that discuss their own tradecraft. The AMAs below provide a rare glimpse into how extraordinarily talented groups operate.

* Plaid Parliament of Pwning AMA
* Samurai AMA
* Tradecraft Lectures 8 & 9

References

https://trailofbits.github.io/ctf/

@@ Line 1: / Line 1: @@
 Source: https://trailofbits.github.io/ctf/
-CTF Field Guide
+==CTF Field Guide==
 “Knowing is not enough; we must apply. Willing is not enough; we must do.” - Johann Wolfgang von Goethe
-Welcome!
+==Welcome!==
 We’re glad you’re here. We need more people like you.
@@ Line 36: / Line 38: @@
 ==Capture the Flag==
-Why CTF?
+===Why CTF?===
 Computer security represents a challenge to education due to its interdisciplinary nature. Topics in computer security are drawn from areas ranging from theoretical aspects of computer science to applied aspects of information technology management. This makes it difficult to encapsulate the spirit of what constitutes a computer security professional.
@@ Line 49: / Line 52: @@
 If you ever wanted to start running, you were probably encouraged to sign up to a 5k to keep focused on a goal. The same principle applies here: pick a CTF in the near future that you want to compete in and come up with a practice schedule. Here are some CTFs that we can recommend:
-PicoCTF and PlaidCTF by CMU
+* PicoCTF and PlaidCTF by CMU
-HSCTF is made for high school students
+* HSCTF is made for high school students
-Ghost in the Shellcode (GitS)
+* Ghost in the Shellcode (GitS)
-CSAW CTF by NYU-Poly
+* CSAW CTF by NYU-Poly
-UCSB iCTF is for academics only
+* UCSB iCTF is for academics only
-Defcon CTF
+* Defcon CTF
 Visit CTF Time and the CapCTF calendar for a more complete list of CTFs occurring every week of the year.
-How is a Wargame different?
+==How is a Wargame different?==
 Wargames are similar to a CTF but are always ongoing. Typically, they are organized into levels that get progressively harder as you solve more of them. Wargames are an excellent way to practice for CTF! Here are some of our favorites:
-Micro Corruption
+* Micro Corruption
-SmashTheStack
+* SmashTheStack
-OverTheWire
+* OverTheWire
-Exploit Exercises
+* Exploit Exercises
-What about CCDC?
+==What about CCDC?==
 There are some defense-only competitions that disguise themselves as CTF competitions, mainly the Collegiate Cyber Defense Challenge (CCDC) and its regional variations, and our opinion is that you should avoid them. They are unrealistic exercises in frustration and will teach you little about security or anything else. They are incredibly fun to play as a Red Team though!
@@ Line 148: / Line 153: @@
 Once in university, take classes that force you to write code in large volumes to solve hard problems. IMHO the courses that focus on mainly theoretical or simulated problems provide limited value. Ask upper level students for recommendations if you can't identify the CS courses with programming from the CS courses done entirely on paper. The other way to frame this is to go to school for software development rather than computer science.
-Capture the Flag
+==Capture the Flag==
 If you want to acquire and maintain technical skills and you want to do it fast, then you should play in a CTF or jump into a wargame. The one thing to note is that many of these challenges attach themselves to conferences (of all sizes), and by playing in them you will likely miss the entire rest of the conference. Try not to over do it, since conferences are useful in their own way (see the rest of the career guide).
 There are some defense-only competitions that disguise themselves as normal CTF competitions, mainly the Collegiate Cyber Defense Challenge (CCDC) and its regional variations, and my opinion is that you should avoid them. They are exercises in system administration and frustration and will teach you little about security or anything else. They are incredibly fun to play as a Red Team though.
-Communication
+==Communication==
 In any role, the majority of your time will be spent communicating with others, primarily through email and meetings and less by phone and IM. The role/employer you have will determine whether you speak more with internal infosec teams, non-security technologists, or business users. For example, expect to communicate more with external technologists if you do network security for a financial firm.
 Tips for communicating well in a large organization:
-Learn to write clear, concise, and professional email.
+* Learn to write clear, concise, and professional email.
-Learn to get things done and stay organized. Do not drop the ball.
+* Learn to get things done and stay organized. Do not drop the ball.
-Learn the business that your company or client is in. If you can speak in terms of the business, your arguments a) to not do things b) to fix things and c) to do things that involve time and money will be much more persuasive.
+* Learn the business that your company or client is in. If you can speak in terms of the business, your arguments a) to not do things b) to fix things and c) to do things that involve time and money will be much more persuasive.
-Learn how your company or client works, ie. key individuals, processes, or other motivators that factor into what gets things done.
+* Learn how your company or client works, ie. key individuals, processes, or other motivators that factor into what gets things done.
 If you are still attending a university, as with CS courses, take humanities courses that force you to write.
-Meet People
+==Meet People==
 Find and go to your local CitySec, an informal meetup without presentations that occurs once monthly in most cities. At Trail of Bits, we attend our local NYSEC.
 ISSA and ISC2 focus on policy, compliance and other issues that will be of uncertain use for a new student in this field. Similarly, InfraGard mainly focuses on non-technical law enforcement-related issues. OWASP is one of the industry's worst examples of vendor capture and is less about technology and more about sales.
-Conferences
+==Conferences==
 If you've never been to an infosec conference before, use the google calendar below to find a low-cost local one and go. There have been students of mine who think that attending a conference will be some kind of test and put off going to one for as long as possible. I promise I won't pop out of the bushes with a final exam and publish your scores afterward.
-Information Security Conferences Calendar
+==Information Security Conferences Calendar==
 If you go to a conference, don't obsess over attending a talk during every time slot. The talks are just bait to lure all the smart hackers to one location for a weekend: you should meet the other attendees! If a particular talk was interesting and useful then you can and should talk to the speaker. This post by Shawn Moyer at the Defcon Speaker's Corner has more on this subject.
 If you're working somewhere and are having trouble justifying conference attendance to your company, the Infosec Leaders blog has some helpful advice.
-Certifications
+==Certifications==
 This industry requires specialized knowledge and skills and studying for a certification exam will not help you gain them. In fact, in many cases, it can be harmful because the time you spend studying for a test will distract you from doing anything else in this guide.
@@ Line 270: / Line 281: @@
 This module is about becoming familiar with the native layer and developing strategies for understanding, analyzing and interpreting native code. By the end of this module you should be capable of performing a “reverse compilation” -- going from assembly fragments to statements in higher level languages -- and, in the process, deriving meaning and programmer intent.
-Lecture
+==Lecture==
 Learning x86 can appear daunting at first and requires some dedicated study to master. We recommend reading Chapter 3 of "Computer Systems: A Programmer's Perspective" to learn how C programs get compiled into machine code. Once you have some basic, working knowledge of this process then keep a handy reference guide around like the x86 Assembly Guide from the University of Virginia. We've found this video series from Quinn Liu to be a quick and painless introduction too.
-CS:APP Chapter 3: Machine-Level Representation of Programs
+* CS:APP Chapter 3: Machine-Level Representation of Programs
-x86 Assembly Guide
+* x86 Assembly Guide
-Introduction to x86 Assembly
+* Introduction to x86 Assembly
-Workshop
+==Workshop==
 The following programs are both “binary bombs.” Reverse engineer the following linux programs to identify the series of inputs that will “defuse” the bomb. Each successive level of the bomb focuses on a different aspect of native code. For example, in the lab from CMU you will encounter different data structures (linked lists, trees) as well as how control flow structures (switches, loops) manifest in native code. While reversing these programs you may find it useful to keep track of your progress by transforming what you see into C or another high level language.