InfoSec Institute – CTF Level 12

This is a walkthrough of InfoSec Institute’s CTF challenge, Level 12.

As I mentioned in some of the other walkthroughs, the first step is to look through the source code for anything that’s out of place. After that, I typically evaluate the headers and other responses (with Chrome’s developer tools) and proceed from there. Anything that the site loads will be revealed in the “Network” tab, so it’s a pretty good source of information that’s always available.

In this level, the file “design.css” was out of place. Viewing the contents showed an invalid CSS statement:

This is not a color

In CSS, colors are typically specified with their hexadecimal value. (There are a couple of other acceptable formats, but that’s irrelevant for now.)

Load that string into a Python interpreter, and use the built-in “decode” function. Pretty intuitive, yeah?
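As a rough sketch of that step in Python 3 (where bytes.fromhex takes the place of the old Python 2 "decode" idiom), the snippet below uses a made-up hex string as a stand-in, not the actual value from design.css:

```python
# Hypothetical value: the hex encoding of the text "infosec",
# NOT the real string found in design.css.
hex_string = "696e666f736563"

# bytes.fromhex() turns each pair of hex digits into one raw byte;
# decode() then maps those bytes to text.
decoded = bytes.fromhex(hex_string).decode("ascii")
print(decoded)
```

Swap in the string from the stylesheet (minus any leading "#") to get the real flag.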

The flag also states the obvious.


InfoSec Institute – CTF Level 11

This is a walkthrough for InfoSec Institute’s CTF Challenge, Level 11.

The only immediate difference between this and level 10 is the addition of a grainy PHP logo.

php-logo-virus

Grainy images are one indicator of steganography, so I proceeded along the route of checking for readable strings. Using the strings command again revealed the flag instantly (but read on!). However, opening it in emacs revealed that it was in the header of the image file.

Raw data

This is hardly the same as steganography, which hides the message in the image data. This flag is hidden in the image’s EXIF (Exchangeable Image File Format) data, which provides metadata about the image. If you have exiftool installed (apt-get install exiftool, IIRC), you can get the same information:

exiftool php-logo-virus.jpg | grep -i infosec

Exiftool Output

The “document name” field contains the string, plus two additional bytes. If you had trouble viewing the image properties, it was likely because the viewer wasn’t prepared for the extra bytes at the end of the string. Remember, the “strings” command only reveals printable characters.

Raw data

Depending on whether or not the flag includes the extra bytes, there are two options:

infosec_flagis_aHR0cDovL3d3dy5yb2xsZXJza2kuY28udWsvaW1hZ2VzYi9wb3dlcnNsaWRlX2xvZ29fbGFyZ2UuZ2lm

or

infosec_flagis_aHR0cDovL3d3dy5yb2xsZXJza2kuY28udWsvaW1hZ2VzYi9wb3dlcnNsaWRlX2xvZ29fbGFyZ2UuZ2lm\240\206^A

Bytes \240 and \206 (emacs shows them in octal; in hex they are 0xA0 and 0x86) are outside of the printable ASCII range, and they aren’t valid UTF-8 on their own, since both are continuation bytes with no lead byte in front of them. The control characters correspond to the end of the field, which is NUL-terminated. These characters show up this way because emacs is forced to display something, but you can see the true value with a hex editor. I used hexedit in the following screenshot:

Hex Editor View

The additional unprintable characters are in red and NUL bytes in yellow. As far as any EXIF parser is concerned, these bytes are part of the field. That leaves us with bytes A0 86 01 unaccounted for.

Note: Your raw data may differ from mine, due to endianness. If your hex editor displayed the following, it’s still correct, but you’ll notice that the two bytes in each four-digit group are swapped.

          6e69 6f66 6573 5f63 6c66 6761
7369 615f 5248 6330 6f44 4c76 6433 6433
3579 6279 7832 5a73 4a58 617a 6b32 5975
3832 6475 7357 6176 3157 5a68 5632 597a
3969 6277 6433 636c 4e6e 6173 5257 586c
7832 5a76 3932 6266 4647 5a79 5532 5a75
6c32 a06d 0186 0000 0100
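If you want to convince yourself that the dump above is the same data, here is a small sketch that swaps the two bytes of each group back into file order. It only uses the first row of the dump, and the variable names are just for illustration:

```python
# First row of the byte-swapped dump shown above.
row = "6e69 6f66 6573 5f63 6c66 6761"

# Each group is a 16-bit word printed with its two bytes swapped,
# so reversing every two-byte chunk restores the on-disk order.
raw = b"".join(bytes.fromhex(word)[::-1] for word in row.split())
print(raw)
```

The same loop works on the remaining rows, including the trailing A0 86 01 bytes.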

Before we cross over the threshold into text-encoding hell, extended ASCII sets, control characters, and all the levels dedicated specifically to Unicode, let’s take a step back. (Sorry guys, I only led you down this road to take a look at the actual contents of the field!)

Occam’s razor says the field was obfuscated to prevent what I’ll dub a “View Properties Attack”, and we’re dealing with a printable-characters-only string. This is a n00bs challenge, after all.

Although it lacks the characteristic ending equals signs of a base64-encoded string, the flag we have does contain a valid base64 string. (The trailing = signs are padding, and only show up when the length of the input isn’t a multiple of three bytes.)

Decoding that string yields the following URL:

http://www.rollerski.co.uk/imagesb/powerslide_logo_large.gif

Which links to the following image:

powerslide_logo_large
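The decoding step can be reproduced with Python’s standard base64 module. This is only a sketch of one way to do it; any base64 decoder gives the same result:

```python
import base64

# The part of the flag that follows "infosec_flagis_".
encoded = (
    "aHR0cDovL3d3dy5yb2xsZXJza2kuY28udWsvaW1hZ2VzYi9wb3dlcnNsaWRl"
    "X2xvZ29fbGFyZ2UuZ2lm"
)

# 80 base64 characters decode to 60 bytes, a multiple of three,
# which is why no "=" padding is needed here.
url = base64.b64decode(encoded).decode("ascii")
print(url)
```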

Answer(s)

  • infosec_flagis_\x61\x48\x52\x30\x63\x44\x6F\x76\x4C\x33\x64\x33\x64\x79\x35\x79\x62\x32\x78\x73\x5A\x58\x4A\x7A\x61\x32\x6B\x75\x59\x32\x38\x75\x64\x57\x73\x76\x61\x57\x31\x68\x5A\x32\x56\x7A\x59\x69\x39\x77\x62\x33\x64\x6C\x63\x6E\x4E\x73\x61\x57\x52\x6C\x58\x32\x78\x76\x5A\x32\x39\x66\x62\x47\x46\x79\x5A\x32\x55\x75\x5A\x32\x6C\x6D\xA0\x86\x01
  • infosec_flagis_aHR0cDovL3d3dy5yb2xsZXJza2kuY28udWsvaW1hZ2VzYi9wb3dlcnNsaWRlX2xvZ29fbGFyZ2UuZ2lm
  • infosec_flagis_http://www.rollerski.co.uk/imagesb/powerslide_logo_large.gif
  • infosec_flagis_powerslide

 

 


InfoSec Institute – CTF Level 9

This is a walkthrough for InfoSec Institute’s CTF challenge, Level 9.

The challenge presents a login screen for a Cisco Intrusion Detection System (IDS). I tried a few typical username/password combinations (root/root, admin/password, etc.) before googling “Cisco IDS default password”.

Google

Sure enough, ‘root’/‘attack’ worked, and the flag was given in a popup box: Infosec_flagis_?

At first glance, this looks like the string presented in their Level 4 CTF challenge, but the character spacing is all wrong. We already determined that they’re using the format “infosec_flagis_?????”, and they’re unlikely to change the grouping since it helps identify the flag in a CTF event.

The flag is presented in plaintext, but reversed. To undo this, use the “rev” command in Linux, which reverses a string passed into it:

echo "ssaptluafed_sigalf_cesofni" | rev
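If you’re not on Linux, the same reversal is a one-liner in Python. This is just an equivalent sketch, using slice notation with a step of -1:

```python
reversed_flag = "ssaptluafed_sigalf_cesofni"

# A slice with step -1 walks the string from the last character to the first.
flag = reversed_flag[::-1]
print(flag)
```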

Answer: infosec_flagis_defaultpass


InfoSec Institute – CTF Level 8

This is a walkthrough on Level 8 of InfoSec Institute’s CTF challenge. The challenge begins by asking if you’d like to download “app.exe”. Since I’m not about to run an untrusted *.exe file (and I’m on Linux anyway), I decided to open it up in emacs. The flags follow a common format, so performing a string search can’t hurt:

Screenshot from 2015-03-22 17:21:41

Well, that was easy.

This can also be done with the strings command, which prints strings of printable characters. Binary files do have quite a few readable characters, so combining strings with grep shouldn’t hurt (the -i flag means case-insensitive search):

strings app.exe | grep -i infosec

Gives the output:

Screenshot from 2015-03-22 17:27:40
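For the curious, the strings-plus-grep combination can be approximated in a few lines of Python. This is a simplified sketch, not a reimplementation of either tool: the function name is made up, and the sample bytes are a hypothetical stand-in for the contents of app.exe.

```python
import re

def find_strings(data: bytes, needle: bytes = b"infosec", min_len: int = 4):
    """Return printable-ASCII runs of at least min_len bytes that
    contain needle (case-insensitive), roughly like strings | grep -i."""
    runs = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    return [run.decode("ascii") for run in runs if needle.lower() in run.lower()]

# Hypothetical binary data, NOT the real app.exe.
sample = b"MZ\x90\x00\x03\x00junk\x00\x01infosec_flagis_EXAMPLE\x00\xff\xfe"
print(find_strings(sample))
```

To run it against the real file, read it in binary mode first, e.g. open("app.exe", "rb").read().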

 

 

 


InfoSec Institute – CTF Level 7

This challenge is linked directly to a file called “404.php”, which serves up the following content:

f00 not found 
Something is not right here???
btw...bounty $70

This is intentional, and not an accidental 404, given the level-specific bounty and the fact that it’s linked directly in the menu. Let’s try http://ctf.infosecinstitute.com/levelseven.php, since that’s the naming pattern all the other levels follow. Sure enough, it works. Kind of.

The page is blank, but instead of a 404 status code, we get 200. Well, not really:

Not a 200 (OK)

The HTTP status is 200, but the status text should be “OK”, so let’s see what it actually says:

Screenshot from 2015-03-22 17:12:18

Ahh, another base64-encoded string. We came across that in level 2, so we’ll just use that atob() function again:

Easy enough! But why does this work? Did they hack the internet?!

The HTTP status code is separate from the status text – they’re just commonly used together. We can generate the same effect with PHP’s header function.

http://cmattoon.com/fake404.php

<?php
// Send a 404 status code with custom status text, then stop the script.
header("HTTP/1.0 404 Just kidding, it's here.");
exit;

Fake
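To make the split between status code and status text concrete, here is a small Python sketch that pulls a status line apart and decodes the text portion. The payload is a hypothetical placeholder (the real level puts its base64-encoded flag in that spot), and parse_status_line is a helper invented for this example:

```python
import base64

def parse_status_line(line: str):
    """Split an HTTP/1.x status line into (version, code, reason text)."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

# Hypothetical payload, built here so the example is self-contained.
hidden = base64.b64encode(b"example payload").decode("ascii")
status_line = f"HTTP/1.1 200 {hidden}"

version, code, reason = parse_status_line(status_line)
print(code, base64.b64decode(reason))
```

Only the three-digit code has a defined meaning; the text after it is free-form, which is what makes this trick possible.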

 

 

It’s important to note that some software (crawlers, for example) may only look at the status code. Generating random HTTP statuses because you can is generally not a useful thing to do in real life ;)


InfoSec Institute – CTF Level 4

This is a writeup on Level 4 of InfoSec Institute’s CTF challenge.

As discussed in Level 2, we begin with a general survey of the page. After checking over the page and not finding anything that stood out, I began inspecting the HTTP headers. (The hint, after all, is “HTTP stands for Hypertext transfer protocol”.)

HTTP Headers

The response looks about the same, with the exception of an additional cookie that wasn’t present before:

fusrodah=vasbfrp_syntvf_jrybirpbbxvrf

Note: If you’ve been browsing around their site, and came across this level before, the cookie WILL be set on all headers, since it’s set for all pages on the domain ctf.infosecinstitute.com. You can see this for yourself by looking at the cookie file. (In Chrome, visit chrome://settings/cookies)

Screenshot from 2015-03-22 17:06:37

Since this is clearly encoded, encrypted, or otherwise obfuscated, I made a few (correct) assumptions to find a starting point.

  • A low level (Level 4) challenge would use a classical cipher, rather than true symmetric/asymmetric encryption (à la PGP/AES)
  • A digraph (“bb”) and two groupings ending with “v(r)f” indicate to me that it’s likely a monoalphabetic substitution cipher.

Note that these assumptions come from having a bit of experience with cryptography, and are simply a starting point. Since we have a short ciphertext, it’s difficult to be sure about anything until we actually attempt decryption. While it could possibly be a Vigenère, Playfair, or something more complex, starting off with a simple monoalphabetic substitution cipher seemed like the best bet.

The most well-known of all the classical ciphers is probably the Caesar shift cipher, of which ROT-13 is one variant. I don’t think I’ve ever met anyone who wasn’t familiar with it, but it’s not a question I typically ask. To be precise, a shift cipher ROTates the letters of the alphabet by a certain number of positions (the key). The classical Caesar cipher uses a shift of 3, while the ROT-13 cipher rotates the alphabet by 13 characters, resulting in the following shift:

Caesar Shift (ROT-13)

To solve, simply decode the value of the cookie. (Note that the ‘=’ sign is not part of the allowed message space, so the values will have to be decoded separately.)

infosec_flagis_welovecookies
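Python ships with a ROT-13 codec, so the decoding can be sketched in a couple of lines. Only the cookie value is decoded, and the underscores pass through untouched because the codec only rotates letters:

```python
import codecs

cookie_value = "vasbfrp_syntvf_jrybirpbbxvrf"

# The "rot_13" codec rotates a-z and A-Z by 13 and leaves everything else alone.
plaintext = codecs.decode(cookie_value, "rot_13")
print(plaintext)
```

Since 13 is half of 26, running the same call on the plaintext gives the ciphertext back.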

If it wasn’t a key of 13, we could try each possible combination, looking for (hopefully English) text that makes sense. This is known as a brute-force attack, and would be the next logical step before moving on to additional cipher types. In fact, if you used an online tool or other program to solve it, it likely worked by generating each of the 26 possible decryptions, then checking which one(s) had one (or both) of the following:

  1. A character frequency distribution approaching that of the expected plaintext language.
  2. Common words in the expected plaintext language

Method 1 is the most accurate, given “enough” ciphertext. It’s a statistical comparison between how often certain letters appear in a language, and how often they appear in the ciphertext. When the ciphertext is short, however, this method can fail since there’s not enough data to accurately compare it.

The second method is especially useful in situations like this, where we know some of the plaintext (the words “infosec” and “flag”). In fact, this makes it easier to write the script on-the-fly if you’re not familiar with calculating frequency distributions, or are otherwise “not a math person”. By scanning each decrypted (deciphered) set of characters for these words, we can rapidly narrow down our search for the correct cipher and key. (I wouldn’t waste time searching for “is”, because it’s too short to be significant.)

And, of course, the mandatory Python script:

 

Which resulted in the following output:

Solver
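A minimal sketch of the brute-force approach described above (method 2, scanning every candidate for known words) might look like this. It is not the original script; the function names and the list of known words are assumptions made for illustration:

```python
import string

def shift_decrypt(ciphertext: str, key: int) -> str:
    """Undo a shift cipher by moving each lowercase letter back by key places."""
    result = []
    for ch in ciphertext:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") - key) % 26 + ord("a")))
        else:
            result.append(ch)  # leave underscores and other symbols alone
    return "".join(result)

def brute_force(ciphertext: str, known_words=("infosec", "flag")):
    """Try all 26 keys and keep the candidates that contain a known word."""
    hits = []
    for key in range(26):
        candidate = shift_decrypt(ciphertext, key)
        if any(word in candidate for word in known_words):
            hits.append((key, candidate))
    return hits

for key, candidate in brute_force("vasbfrp_syntvf_jrybirpbbxvrf"):
    print(key, candidate)
```

With a short ciphertext like this one, word matching is enough to single out the key without any frequency analysis.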


InfoSec Institute – CTF Level 3

This is another tutorial, this time for Level 3 of InfoSec Institute’s CTF challenge. This level involves decoding (not decrypting) some data to retrieve the flag. The challenge can be found at http://ctf.infosecinstitute.com.

Disclaimer: Capturing the flag for this level takes around 30 seconds with online tools, and is basically a no-brainer. Since this is about education and learning, and not blindly using tools, let’s dig into what’s actually going on here.

The message for this level is encoded with two different methods. The first method we have to decode is obviously the QR code, but some simple QR code readers will fail, since the encoded data is not a URL (hint).

Original QR code

This article will touch upon the basics of manual QR code decoding, but is far from being a comprehensive source of information on the subject. For the basics, I’ll refer you to QR Code Essentials, a PDF document published by the creators of the QR code. If you’re looking for information on implementing the scheme, or want to explore the subject in-depth, I highly recommend thonky’s tutorial on the subject. It’s a very thorough article aimed at programmers and is invaluable in working with QR codes in software.

QR Code Basics

QR codes were developed as an alternative to other barcode schemes (such as the ones commonly used on driver’s licenses). They have several advantages over those schemes and are much more resilient to human error and physical damage. If you’ve ever tried to implement a barcode scanning system, you’re no doubt familiar with the headaches that come with it.

A QR code stores information in binary format – each “pixel” represents a 1 or 0. In the interest of specificity, each square can obviously be much larger than a true pixel, and the black and white coloration doesn’t exactly correlate to 1 and 0, for reasons that will be explained shortly. In the QR code world, each of these squares is referred to as a module (light modules and dark modules).

For starters, let’s take a look at the anatomy of a QR code:

Source: http://www.nacs.org/LinkClick.aspx?fileticket=D1FpVAvvJuo%3D&tabid=1426&mid=4802

One of the major strengths of QR codes is their error correction, which allows creative use and distortion of the code while maintaining its ability to be scanned.

qrcode

Separately from error correction, a bit mask is applied to the data to keep the code easy to scan, which results in skewing the correlation between dark/light modules and 1/0 values.

masking-example
Example of bit masking in QR codes.

Essentially, the encoder tries each of the eight mask patterns and scores the result by looking for areas of consecutive matching squares (the penalty score); the mask with the lowest penalty is the one that gets used. Since large areas of black/white space could make automatic decoding unpredictable or impossible, a mask is used to achieve a more uniform distribution of dark/light modules.

2000px-QR_Code_Mask_Patterns.svg
In this context, values i and j refer to ‘row’ and ‘column’ number. Taking a value modulo (%) 2 is a standard way of indicating even/odd values, hence the diagonal/checkerboard pattern for this mask.

Note that the bit mask applies only to the data and error correction areas, and not any of the reserved ones. For this reason, it’s sometimes referred to as the D/E mask.
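As a quick illustration of the modulo rule, here is a sketch of mask pattern 000, the checkerboard one, where a module is flipped whenever (row + column) is even. The helper names are invented for this example, and the code ignores the reserved areas entirely:

```python
def mask_000(row: int, col: int) -> bool:
    """Mask pattern 000: flip the module when (row + col) is even."""
    return (row + col) % 2 == 0

def toggle(bit: int, row: int, col: int) -> int:
    """XOR a data bit with the mask. Applying it twice restores the bit."""
    return bit ^ int(mask_000(row, col))

# Print a small corner of the mask: '#' marks modules that get flipped.
for row in range(4):
    print("".join("#" if mask_000(row, col) else "." for col in range(4)))
```

Because masking is an XOR, the same function both applies and removes the mask.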

Determining the Version

First, we’ll need to determine the version. The version, along with the error correction level, determines how much data can be stored in the QR code. Note that this is not a sequential version (like most things in the IT world), but rather an indicator of the dimensions.

There are 40 possible QR code versions, which have dimensions ranging from 21×21 (v.1) to 177×177 (v.40) modules. The version can be inferred by the width of the overall QR code (in modules) using the formula:

width = (version * 4) + 17

Or, conversely:

version = (width - 17) / 4
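Both directions of the formula fit in a few lines of Python. This is a sketch with made-up function names; the range check simply reflects the 21 to 177 module limits mentioned above:

```python
def qr_width(version: int) -> int:
    """Width in modules for a given QR version (1 to 40)."""
    return version * 4 + 17

def qr_version(width: int) -> int:
    """QR version for a given width in modules."""
    if not 21 <= width <= 177 or (width - 17) % 4 != 0:
        raise ValueError(f"{width} is not a valid QR code width")
    return (width - 17) // 4

print(qr_version(29))
```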

In this case, the width of the QR code is 29, which makes this a version 3 QR code. Additionally, each version uses one of four possible error correction values:

  • L – 7%
  • M – 15%
  • Q – 25%
  • H – 30%

These error correction values indicate the amount of redundant data, or the maximum theoretical damage/defacement that can occur and still scan accurately. (A 40L QR code holds the most data and a 1H contains the least.)

barcode-image
QR Code with Level “Q” error correction (25%)
barcode-image2
This works too!
barcode-image3
Even this will scan!

 

Determining the Mask

Since the encoding type is masked (it lives in the four modules at the bottom-right corner), we have to determine the mask first. This is done by looking at the format pattern, which is masked by a different mask. The format mask is a specific 15-bit mask, 101010000010010 (decimal: 21522), used only for this purpose. Of these 15 bits, we’re only interested in the first 5 (for now).

qrformat
The first 5 bits store the error correction level (2 bits) and the D/E (‘regular’) mask (3 bits)

 

In this specific case, it’s safe to assume dark=1 and light=0, which yields a binary value of 11101. Therefore, we’ll XOR the actual value 11101 with the first 5 bits of the format mask:

  11101
^ 10101
-------
  01000

This means the error-correction level is 01 (L) and the mask ID is 000. Error correction level L allows for around 7% correction. Error correction always comes with a corresponding decrease in the amount of information that can be stored, and level L gives up the least. This makes sense, after all, since bits spent on redundancy can no longer be used to store “real” data.
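The XOR above is easy to check in code. In this sketch, only the 01 = L mapping comes from the walkthrough; the other three level codes are filled in from the QR code specification, and the function name is made up:

```python
FORMAT_MASK = 0b101010000010010  # the 15-bit format mask (decimal 21522)

# Two-bit error correction indicators. Only 01 = L appears in this
# walkthrough; the rest are taken from the QR code specification.
EC_LEVELS = {0b01: "L", 0b00: "M", 0b11: "Q", 0b10: "H"}

def decode_format_prefix(first_five_bits: int):
    """Unmask the first 5 format bits and split them into (EC level, mask ID)."""
    unmasked = first_five_bits ^ (FORMAT_MASK >> 10)  # top 5 bits of the mask
    return EC_LEVELS[unmasked >> 3], unmasked & 0b111

print(decode_format_prefix(0b11101))
```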

Content-Encoding & Maximum Data Length

We’re about to apply the mask, but let’s make some predictions first. Specifically, I’ll walk you through how to determine the encoding type and character count.

Using the tables available on this page, along with the version and error correction values, we can determine the maximum length of the data contained in the code:

QR Code data limits - version 1-3

We can expect up to 440 bits, 127 digits (0-9), 77 characters (A-Z, 0-9 + special chars), or 53 bytes of binary data. (Kanji encoding is used for special Japanese characters. There’s also an additional ECI mode used for other character sets, but we won’t be using either.)

It’s worth keeping in mind that the number of bits per character depends on the encoding type. Byte mode uses 8 bits per character (ASCII itself only needs 7), while alphanumeric mode packs each pair of characters into 11 bits. Kanji mode needs 13 bits per character, hence the drop in numbers for Kanji characters. Various encoding tricks can be employed here, but that’s outside the scope of this article.

Data Encoding Type & Content Length

We need to determine the encoding type in order to determine the maximum data length. The encoding type and data length are encoded in the first 13 bits of the message, starting from the bottom-right. Bits in a QR code are read in a zig-zag pattern, two columns at a time:

QR code with reserved areas redacted
The yellow areas are redacted in preparation for unmasking.

The four encoding methods (mentioned above) have the following mode indicators, which are stored in the first four bits. Again, the values in the above image are masked, and will not correlate with these binary values (yet):

  • Numeric – 0001
  • Alphanumeric – 0010
  • Bytes – 0100
  • Kanji – 1000

After this is a 9-digit (binary) value representing the character length. It’s padded on the left with zeros to ensure a consistent length. For example, the binary value 1 would be encoded as 000000001.

We can take a quick peek at the data by applying the mask to the bottom corner. I’ll discuss this next, so you’ll have to take my word on it for now. Also, note that the black/white values have changed from above, due to the image processing.

corner
Black = 0, White = 1

Starting from the bottom right, this gives us a result of:

0010 001001010

The first four bits are the encoding type (alphanumeric), and the remaining nine bits (1001010 after removing padding) convert to 74 in decimal. We should get 74 alphanumeric characters once this is fully decoded.
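Here is the same header parse as a short Python sketch. It assumes the 4-bit mode indicator and 9-bit character count described above, and the function name is just for illustration:

```python
MODES = {
    "0001": "numeric",
    "0010": "alphanumeric",
    "0100": "bytes",
    "1000": "kanji",
}

def parse_header(bits: str):
    """Split the first 13 bits into (encoding mode, character count)."""
    bits = bits.replace(" ", "")
    mode = MODES[bits[:4]]
    count = int(bits[4:13], 2)  # leading zero padding is ignored by int()
    return mode, count

print(parse_header("0010 001001010"))
```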

Applying the Mask

The image on the left is the original QR code with reserved areas removed. On the right is mask 000, which we’ve determined to be the appropriate mask for this code.

Step1

For contrast, I’ve changed the mask to red before layering it on the original.

mask-overlay

I’ve applied the mask visually, using the “difference” filter in GIMP. This essentially applies an XOR to the two layers, but some additional inversion is required. I won’t explain the process behind it, but you can see it applied for an OTP cipher here. The grids were generated in OpenOffice Calc, and I’ve included a link to the original at the end of this post.

At long last, the unmasked data:

final
This is the unmasked image, after a bit of processing. (Black = 0, White = 1)

Next, the bits shown above need to be decoded according to their respective schemes. I’m going to skip the discussion on character encoding, since there’s enough room for at least an entire other article on the subject.

The Short Answer

As I mentioned at the beginning of this post, there are several online tools that can decode this easily. In fact, there are tons of tools available to do the decoding. Since no article of this nature would be complete without a Python code example, you can also use the python-qrtools package (Ubuntu).

Encoding #2 – Morse Code

It turns out that the QR code was encoding a series of periods, hyphens and spaces (Morse code).

.. -. ..-. --- ... . -.-. ..-. .-.. .- --. .. ... -- --- .-. ... .. -. --.

Now that we’ve got the Morse code, we simply need to decode it:

morse
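If you’d rather not decode it by hand, a lookup table does the job. This sketch covers only the letters A to Z of International Morse code, which is all this message needs, and the function name is made up:

```python
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z",
}

def decode_morse(message: str) -> str:
    """Decode space-separated Morse symbols into uppercase letters."""
    return "".join(MORSE[symbol] for symbol in message.split())

message = ".. -. ..-. --- ... . -.-. ..-. .-.. .- --. .. ... -- --- .-. ... .. -. --."
print(decode_morse(message))
```

The message is 55 dots and dashes plus 19 spaces, which matches the 74 characters predicted from the QR header.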



InfoSec Institute – CTF Level 2

This article is a solution to Level 2 of InfoSec Institute’s CTF challenge. The challenge can be found here.

Although the solution is actually very simple, I’m going to describe a number of other steps to consider. If you just want the final answer, feel free to skip to the end.

Step 1 – Static Source Code Analysis

The first step is to look at the underlying source code. I’ll typically do a quick scan to see if anything is out of place or poorly-hidden. Incomplete or shoddy cover-ups can serve as a red flag to identify areas that need further investigation.

There’s some obfuscated JavaScript near the bottom of the page, but it’s easily recognized as just a Google Analytics snippet. If I wasn’t aware of that, I’d search for the string “function(i,s,o,g,r,a,m)”, which, sure enough, yields plenty of results.

The next chunk of code is a generic asynchronous loader function for the domain pardot.com. Oh, it’s just some marketing code from a (probably legitimate) company. I’ve never heard of them, but I’m not immediately suspicious.

There are a couple of unaccounted-for variables in the second snippet, but they appear to be consistent across pageviews, so I’m assuming it’s a unique identifier of some sort. I’ll mark a note to return here later, if needed.

Finally, there’s this:

Screenshot from 2015-03-12 19:28:31

The “leveltwo.jpeg” image is inside the “lvlone” div. This could certainly be a simple typo, but typographical errors break things, too. Since there’s nothing super solid to go on, and the message references the image file, I’ll start the next step with this.

Step 2 – Checking Dependencies

Any number of scripts (or other resources) can be included in the request. Although it’s not at all uncommon for people to host their own libraries rather than use a content delivery network (CDN), local paths to scripts and CSS files for common libraries should be investigated further.

Screenshot from 2015-03-12 19:41:52

If we trust the CDN, we can trust that the content it delivers is the actual jQuery library or Bootstrap CSS file, and not a malicious script that someone simply named “bootstrap.min.css”.

Screenshot from 2015-03-12 19:42:58

This doesn’t account for MITM-style vulnerabilities, where someone listening on your network could supply an altered copy of these libraries. 

The local files could be verified with an md5 hash, but I’m more interested in the image at this point. I’ll save that for later if nothing else turns up.
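For reference, a hash comparison like that could be sketched as follows. The byte strings are hypothetical stand-ins for the site’s local copy and a known-good copy of the library, and the helper names are invented here:

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()

def matches_reference(local_copy: bytes, reference_copy: bytes) -> bool:
    """True when both copies hash to the same MD5 digest."""
    return md5_hex(local_copy) == md5_hex(reference_copy)

# Hypothetical file contents, used only to show the comparison.
official = b"/* bootstrap */ body{margin:0}"
local_copy = b"/* bootstrap */ body{margin:0}"
tampered = official + b"/* something extra */"

print(matches_reference(local_copy, official))
print(matches_reference(tampered, official))
```

MD5 is fine for spotting accidental or lazy changes like this; for anything adversarial, a stronger hash such as SHA-256 (hashlib.sha256) is the safer choice.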

Open up the developer console (F12 in Chrome), and view the “Network” tab. This tab shows all of the individual components required to render the page.

Screenshot from 2015-03-12 19:48:43

 

The most interesting thing to me is the 200 (OK) HTTP response from the server. This indicates that the image does exist, but it’s still not showing up. If the image didn’t exist, I’d expect a 404 (Not Found) response, which is what is returned with this image tag:

You shouldn't actually see an image here

You’ll see the following requests for this page:

Screenshot from 2015-03-12 19:50:39

Note: If you receive a 304 (Not Modified) response instead of a 200, it simply indicates that the image is already saved in your browser’s cache. The server is telling your browser there’s no need to download a new copy, since the image hasn’t changed. Ctrl+F5 will force the browser to get the image from the server.

Screenshot from 2015-03-12 19:53:19

This leaves us with a missing image that isn’t actually missing. Since the image can’t be right-clicked on and saved, we can rule out a fake image of a missing image, like this one:

Screenshot from 2015-03-12 20:09:37

Open the image in a new tab by right-clicking on it and selecting “Open in New Tab/Window”. Developer tools confirms that the image exists, since we get another 200/304 response:

Screenshot from 2015-03-12 20:16:05

The response headers are curiously missing a Content-Type header, which is usually present. Specifically, I’d expect:

Content-Type: image/jpeg

Again, this is nothing super-concerning, just another subtle clue that something is amiss. It could be a simple oversight or server misconfiguration that’s causing the image to not be displayed, or it could be a corrupt image.

By clicking on the “Response” tab, we can view the raw data received from the server, which is:

Base64 Result

This is an easily-identifiable base64-encoded string, which is a valid way of sending images on the web. If you’re not familiar with this method, open the following link and inspect the Response.

http://cmattoon.com/fakeimg/apple.png

Notice the difference in the string lengths? There’s not enough data in the leveltwo.jpeg file to be a valid image (probably). Instead, it looks like an encoded string (text).

The image above (apple.png) isn’t actually an image file. It’s a PHP script that outputs a base64-encoded image.
I’ve used an .htaccess rewrite to make it appear as an image at first glance, and for all intents and purposes, it is an image!

The real file is located at:

http://www.cmattoon.com/fakeimg/apple.php

Not linked to an image file!

Remember that missing Content-Type header? That’s the missing piece that tells the browser to interpret it as image data, and not a bunch of garbled characters. Here’s the same script, but without the Content-Type header:

http://cmattoon.com/fakeimg/apple2.php

Here’s a comparison of the two, with and without the Content-Type header:

Screenshot from 2015-03-12 20:43:13
This script includes the Content-Type header, thus displaying an image.

 

Screenshot from 2015-03-12 20:43:25
Without the appropriate header, the image data is interpreted literally, as text.

You can find the source code for these test files on GitHub.

 

Solution

The solution is to simply decode the string that’s posing as an image. You can use an online tool, or switch over to the “Console” tab of the Developer Tools and decode it in JavaScript with the window.atob() method.

Screenshot from 2015-03-12 20:57:05
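For anyone who prefers Python over the browser console, here is a rough sketch of the same idea: check whether a response body starts with the JPEG magic bytes, and if it doesn’t, try treating it as base64 text. The body used below is a hypothetical placeholder, not the contents of leveltwo.jpeg, and inspect_body is a name invented for this example:

```python
import base64
import binascii

JPEG_MAGIC = b"\xff\xd8\xff"  # every JPEG file starts with these bytes

def inspect_body(body: bytes):
    """Classify a response body as a real JPEG, base64 text, or unknown."""
    if body.startswith(JPEG_MAGIC):
        return "jpeg", body
    try:
        # validate=True rejects any character outside the base64 alphabet.
        return "base64", base64.b64decode(body.strip(), validate=True)
    except binascii.Error:
        return "unknown", body

# Hypothetical "image" that is really just encoded text.
fake_image = base64.b64encode(b"hypothetical secret")
print(inspect_body(fake_image))
print(inspect_body(JPEG_MAGIC + b"\x00\x10JFIF")[0])
```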

That’s all there is to it! Now, go find the string and decode it!

(If you’re too lazy for that, click the image below)

Screenshot from 2015-03-13 08:32:33
Solution (Click to enlarge)