Project: Automated offsite backups for an NSLU2 -- part 10
Previously in this series: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7, Part 8, Part 9.
I'm setting up automated offsite backups from my NSLU2 to Amazon S3. With surprisingly little effort, I've managed to get a tool called s3sync running on the "slug" (as it's known). s3sync is a Ruby script, so in order to run it, I had to install Ruby, which in turn meant that I had to replace the slug's firmware with a different version of Linux, called Unslung. All of this worked pretty much as advertised in the tools' respective documentation -- for the details, see the previous posts in this series.
With all of the pieces in place, I next needed to run some simple tests to make sure s3sync could handle the kind of files I wanted it to back up. In particular, I wanted it to be able to handle deep directory hierarchies, and to remember user and group ownership and file permissions.
The first step was to create some test files.
# cd /tmp
# mkdir testdata
# cd testdata
# mkdir directorynumber1
# cd directorynumber1
# mkdir directorynumber2
# cd directorynumber2
...
# cd directorynumber21
# pwd
/tmp/testdata/directorynumber1/directorynumber2/directorynumber3/directorynumber4/directorynumber5/directorynumber6/directorynumber7/directorynumber8/directorynumber9/directorynumber10/directorynumber11/directorynumber12/directorynumber13/directorynumber14/directorynumber15/directorynumber16/directorynumber17/directorynumber18/directorynumber19/directorynumber20/directorynumber21
# cat > file000
000
# chmod 000 file000
# cat > file644
644
# chmod 644 file644
# cat > file777
777
# chmod 777 file777
# chown guest:nobody file777
# chown bin:administrators file000
# ls -lrt
---------- 1 bin administ 4 Nov 14 2006 file000
-rw-r--r-- 1 root root 4 Nov 14 2006 file644
-rwxrwxrwx 1 guest nobody 4 Nov 14 2006 file777
#
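(If you want to check exactly how long that path is, it's easier to measure than to count by hand -- just remember that wc includes the trailing newline in its total:)
# pwd | wc -c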
So, I had some files with differing permissions and ownership, at the bottom of a directory hierarchy whose full path was over 350 characters long -- I had a vague impression that there might be a 200-character key limit on S3, and I'm always worried about 255-character limits, so 350 seemed like a sensible test length; if a system can manage 350, it can probably manage much larger figures, up to 32,767 or so... Anyway, the next step was to sync the whole thing up to S3:
# cd /tmp/s3sync/
# ./s3sync.rb -r /tmp/testdata <my key ID>.Test:yetanotherprefix
#
A quick check with JetS3t Cockpit confirmed that everything had been uploaded with appropriate-looking keys, and also with properties specifying decent-looking integer owner, group and permission values. This looked good -- no key-length limit issues. However, there was only one way to be absolutely sure that it was working:
# ./s3sync.rb -r <my key ID>.Test:yetanotherprefix/testdata/ /tmp/copytestdata
#
(Note the positions of the slashes, etc. -- the full syntax for s3sync can take a while to work out, but the README documents it well if you take the time to read it...)
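For what it's worth, my reading of the slash behaviour -- worth checking against the README rather than taking on trust -- is that it works much like rsync's: a trailing slash on the source means "copy the contents of this", while no trailing slash means "copy the node itself". So I'd expect these two downloads to put things in different places:
# ./s3sync.rb -r <my key ID>.Test:yetanotherprefix/testdata/ /tmp/copytestdata
# ./s3sync.rb -r <my key ID>.Test:yetanotherprefix/testdata /tmp/copytestdata
...with the first (the one I actually ran) putting directorynumber1 and friends directly under /tmp/copytestdata, and the second creating /tmp/copytestdata/testdata first.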
And then, to confirm that it was all OK:
# cd /tmp/copytestdata/directorynumber1/directorynumber2/directorynumber3/directorynumber4/directorynumber5/directorynumber6/directorynumber7/directorynumber8/directorynumber9/directorynumber10/directorynumber11/directorynumber12/directorynumber13/directorynumber14/directorynumber15/directorynumber16/directorynumber17/directorynumber18/directorynumber19/directorynumber20/directorynumber21/
# ls -lrt
-rw-r--r-- 1 root root 4 Nov 14 01:03 file644
---------- 1 bin administ 4 Nov 14 01:03 file000
-rwxrwxrwx 1 guest nobody 4 Nov 14 01:03 file777
#
...which all looked correct!
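Eyeballing three files is fine, but if I ever test with a bigger tree, something like the following should make the comparison less error-prone. It's only a sketch -- it assumes awk and diff are available on the slug, which I haven't checked -- and it deliberately strips the listings down to permissions, owner, group and name, because the timestamps in the two listings above clearly don't match:
# ( cd /tmp/testdata && ls -lR | awk '{print $1, $3, $4, $NF}' ) > /tmp/original.txt
# ( cd /tmp/copytestdata && ls -lR | awk '{print $1, $3, $4, $NF}' ) > /tmp/restored.txt
# diff /tmp/original.txt /tmp/restored.txt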
So now I knew that s3sync would work from the NSLU2 to Amazon S3, that the file attributes I cared about were being persisted, and that deep directory hierarchies were not a problem. The next step would have to be to get it working with full SSL, as I don't really want my private data flying over the public Internet unencrypted, and then to put the whole thing into a shell script and schedule a cron job to sync daily.
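To give an idea of where this is heading, here's roughly the shape of the script I have in mind. I haven't written or tested it yet, so the paths, the bucket and prefix names, and the --ssl/SSL_CERT_DIR details below are assumptions to be checked against the s3sync README rather than a working recipe:
#!/bin/sh
# Untested sketch of a nightly backup script -- adjust paths and bucket to taste.
# s3sync picks up its credentials from these environment variables.
export AWS_ACCESS_KEY_ID="<my key ID>"
export AWS_SECRET_ACCESS_KEY="<my secret access key>"
# I believe this is needed for --ssl to be able to verify Amazon's certificate.
export SSL_CERT_DIR=/opt/s3sync/certs

cd /opt/s3sync || exit 1
./s3sync.rb -r --ssl "<directory to back up>" "<my key ID>.Backup:daily"
...plus a crontab entry along these lines, assuming cron is set up on the slug:
# run the backup every night at 3am, keeping a log of any output
0 3 * * * /opt/s3sync/backup.sh >> /var/log/s3backup.log 2>&1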
Next: SSL, and scheduling part 1.