By: s0litaire

How to import large MySQL databases into a small Droplet.

April 12, 2017 1.4k views
MySQL Linux Commands Ubuntu 16.04

I'm trying to import my WordPress database into my Droplet (a small 512MB Droplet).

I'm particularly having trouble with one of the tables: it's a 1.6GB table filled with image blobs.

When I run the usual
mysql -u user -p wordpressdatabase < wordpressdump.sql

It runs fine until it hits that large table; a few seconds later the MySQL server "goes away" (crashes?!?) and I can't connect to it for over 10 minutes.

And I hit a timeout if I try to import that specific table using phpMyAdmin.

Is there a way of importing that large table on the small Droplet?

3 Answers
jtittle1 April 12, 2017
Accepted Answer

@s0litaire

Your mileage may vary with this one, but one method you may be able to use without manually breaking down each table is split. This command lets you break your SQL dump down into multiple smaller files, which you can then pipe to MySQL.

To test this, I created a dummy single-table database called posts which was 2.9GB worth of various data, dumped it to an SQL file called posts.sql and then transferred it over to a 512MB Droplet with MariaDB installed (stock installation, no configuration changes/optimizations).

On the 512MB Droplet, I created /usr/local/src/db using:

mkdir -p /usr/local/src/db

Pulled the posts.sql dump file over to that directory using wget, so the file is:

/usr/local/src/db/posts.sql
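(For reference, fetching it could look something like the following; the URL is purely a placeholder for wherever the dump is actually hosted, and scp from the source server works just as well.)

cd /usr/local/src/db
wget https://example.com/posts.sql        # placeholder URL
# or, pulled over SSH from the source server instead:
# scp user@old-server:/path/to/posts.sql /usr/local/src/db/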

I then created a splits directory:

mkdir -p /usr/local/src/db/splits

and ran:

split -n 250 posts.sql splits/sql_

That splits the posts.sql file into 250 chunks, places them in the splits directory, and prefixes each one with sql_, so each split looks like sql_aa, sql_ab, and so on. You could technically split it down even more, but I'm not sure how much it'd help.

You can run ls /usr/local/src/db/splits to confirm the count.

Once you have the splits, you can run:

cat /usr/local/src/db/splits/sql* | mysql -u dbuser -p dbname

And that'll pipe all of the data to the database. It's not super-fast (at all), but short of manually breaking the database down to chunks of tables, it's one option that I can confirm worked on a 2.9GB table.

The table was created and filled using:

DROP TABLE IF EXISTS posts;

CREATE TABLE `posts` (
  `author_id` int(11) NOT NULL,
  `title` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `description` varchar(500) COLLATE utf8_unicode_ci NOT NULL,
  `content` text COLLATE utf8_unicode_ci NOT NULL,
  `date` date NOT NULL
)
....
....
....
....

Where .... is quite a few insert statements (to the tune of 2.9GB) - roughly 17 million.

That said, it did cut it close. The import knocked the free RAM on the Droplet down to about 15MB (with 272MB being cache) and put a consistent load of about 1.0 on the CPU (thus maxing it -- the load mainly comes from I/O wait time), but it didn't crash, fail, error out, etc.

Again, though, it's slow. You may be waiting an hour or more using this method, so don't plan on it being as fast as it would be using the normal restoration methods.
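(Because a restore like this can run for a long time, one precaution -- not part of the answer above, just a common habit -- is to run the pipe under nohup, or inside a screen/tmux session, so a dropped SSH connection doesn't kill it. The sketch below assumes the MySQL credentials are stored in ~/.my.cnf so that no interactive password prompt is needed.)

nohup sh -c 'cat /usr/local/src/db/splits/sql* | mysql dbname' > /usr/local/src/db/restore.log 2>&1 &
tail -f /usr/local/src/db/restore.log     # watch for any errors as it runs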

If you're running anything else on the Droplet (Apache/NGINX, PHP-FPM, etc), I'd stop the processes first and only have MySQL running while using this method.
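(For example, on a typical LEMP setup that might look like the following; the exact service names are an assumption and vary by distribution and PHP version.)

systemctl stop nginx           # or apache2, whichever web server is installed
systemctl stop php7.0-fpm      # the PHP-FPM unit name depends on the PHP version
# ...run the import, then bring them back up:
systemctl start php7.0-fpm
systemctl start nginx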


As a general note, if you're not using Ubuntu's snapd service, you can free up a little RAM by disabling it.

systemctl stop snapd
systemctl disable snapd

snapd allows you to install certain packages provided by Ubuntu, much like apt, but for most users it's neither needed nor a critical service. If you don't use it, or you've never heard of it, you can disable it without any issues.

On my 512MB Droplet it was using about 10MB of RAM with the ability to use as much as 200MB. Not really something I'd keep running on a memory-constrained system.
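(If you want to check what it's actually using on your own Droplet before deciding, free and ps will show it.)

free -m                         # overall used/free/cache memory, in MB
ps -C snapd -o pid,rss,cmd      # snapd's resident memory (RSS, in KB), if it's running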

  • Many thanks!

    Speed is not an issue, so I'll give this a go ASAP.

    • @s0litaire

      No problem at all. From what I can see, across a few tests, the average is 1-4 hours with a 2GB+ dump. It seems to vary a little depending on what is actually happening during the restoration -- i.e. size of tables, rows, columns, etc. But it should do what you need it to do.

  • Quick update. (Snapd is never going anywhere near my servers for now!) ;)
    Shut down all services that were not required.

    I was worried that something went wrong after it only took 1m 46s to import the 1.6GB table!

    So to be sure it was working I dropped the table and re-ran the import a few times to make sure all 73,801 rows were imported correctly.

    Looks like it worked perfectly!

    Thanks again!!

    • @s0litaire

      Nice! Glad to hear that helped :-).

      The time factor may have been simply due to the number of inserts I had going on with the test dump -- 17 million inserts is quite a bit -- though I wanted to get above or at a similar size to make sure it'd work properly.
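(As a footnote to the row-count check mentioned a couple of comments up: a quick way to do it from the shell is shown below; wp_posts is just a placeholder for whichever table was imported.)

mysql -u dbuser -p -e "SELECT COUNT(*) FROM wp_posts;" dbname
mysql -u dbuser -p -e "CHECKSUM TABLE wp_posts;" dbname    # optional extra sanity check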

@s0litaire

Setting up swap on a VPS (Droplet) isn't recommended (by DigitalOcean), so there are a few options. The first would be upgrading your Droplet so that you have more RAM available. With a growing DB size, upgrading would probably be more beneficial now and in the long run.

This is confirmed in the guides provided by DigitalOcean (i.e. to not use swap), such as the one here. Check the red box at the start of the guide.

The second option would be to optimize MySQL's configuration to suit your needs. The default config for MySQL isn't exactly production-ready or intended for long-term use -- it's a starting point. The issue you may face is that you don't have a lot of RAM to work with, so when modifying key_buffer and other configuration options, you may run into more issues than you're having now. Tuning MySQL or MariaDB on low-RAM VPSes or servers (in general) takes time and testing.
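(To give a sense of what "tuning for low RAM" means in practice, here's a rough sketch of the kind of settings people reduce. The values are illustrative assumptions for a 512MB server, not recommendations from the answer above; the right numbers depend entirely on your workload.)

# /etc/mysql/my.cnf (or a file under /etc/mysql/conf.d/) -- illustrative values only
[mysqld]
key_buffer_size         = 8M     # MyISAM index buffer
innodb_buffer_pool_size = 64M    # main InnoDB cache; usually the biggest single memory user
max_connections         = 30     # each connection carries per-thread buffers
tmp_table_size          = 8M
max_heap_table_size     = 8M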

The reason you're seeing timeouts when using phpMyAdmin is the upload and post size limits defined in PHP's php.ini file. If you're using NGINX, client_max_body_size also plays a role: if it's set too low, the php.ini limits don't matter even if substantially raised -- NGINX's limit takes precedence (if you're not using NGINX, then client_max_body_size doesn't matter here).
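(For reference, the relevant knobs look roughly like this; the values are placeholders to illustrate the idea, and the php.ini path varies by distribution and PHP version.)

# php.ini (e.g. /etc/php/7.0/fpm/php.ini on Ubuntu 16.04) -- placeholder values
upload_max_filesize = 2G
post_max_size = 2G
max_execution_time = 3600

# NGINX, inside the relevant http {} or server {} block
client_max_body_size 2G;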

(Excerpt from the linked guide: "One of the easiest ways of increasing the responsiveness of your server and guarding against out of memory errors in your applications is to add some swap space. In this guide, we will cover how to add a swap file to an Ubuntu 16.04 server.")
  • Thanks, I'll have a look at the settings.

    On a tight budget, so upping the Droplet size is a no-go for me at the moment.

    Two options I'm currently thinking about:

    1) Dump sections of rows of the table into standalone .sql files, so I only have to import a small section of the table (trying to keep each section under 200MB) at any one time. (I know there is something I need to add to the import statements to let the database update a single table from multiple imports, but I can't seem to find it at the moment!! A rough sketch of this approach is included just below this comment.)

    2) Use my local copy of the database to edit the affected tables to use links rather than image blobs, then upload that. In the long run this would be better for database size, I'm just not sure of the best way to go about it.

    I'll keep Googling other options.
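(On option 1 above: a common way to do this, sketched here with placeholder names and an assumed numeric id column, is mysqldump's --where flag combined with --no-create-info, so that every chunk after the first contains only INSERTs for the existing table.)

# first chunk: includes DROP/CREATE TABLE plus rows with id 1-10000 (the id column is an assumption)
mysqldump -u dbuser -p --where="id BETWEEN 1 AND 10000" dbname wp_posts > chunk_01.sql

# later chunks: INSERTs only, no DROP/CREATE, so they add rows to the same table
mysqldump -u dbuser -p --no-create-info --where="id BETWEEN 10001 AND 20000" dbname wp_posts > chunk_02.sql

# import them one at a time
mysql -u dbuser -p dbname < chunk_01.sql
mysql -u dbuser -p dbname < chunk_02.sql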

Optimise your services to use less memory, add swap, or upgrade to a bigger Droplet.
