2005-12-23

i'm productive

I learned Perl. I wrote a client and a server program in Perl with an extremely lightweight protocol (fast, but not especially secure or reliable). The protocol is very simple. It uses TCP/IP, and what goes over the wire is simply the data, ended with an EOF. So what's the point? You can pipe anything into the client and expect it to be piped out of the server. (Obviously, you could just have them use keyboard and monitor, but then you'd just be sending messages to the other person. I suppose you could set up one of these in each direction and carry on a conversation. Primitive IM!)
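To make the "data ended with an EOF" idea concrete, here is a rough sketch in Python rather than Perl. It is not the actual scripts: the function names are made up, and it assumes that "EOF" simply means the client closes the connection when it runs out of data.

```python
import socket
import threading

def serve_once(listener, sink, blocksize):
    """Accept one connection and collect everything received until EOF."""
    conn, _addr = listener.accept()
    with conn:
        while True:
            block = conn.recv(blocksize)
            if not block:          # empty read: the sender closed, i.e. EOF
                break
            sink.append(block)

def send_all(host, port, data, blocksize):
    """Connect, send data in blocksize chunks, then close to signal EOF."""
    with socket.create_connection((host, port)) as sock:
        for start in range(0, len(data), blocksize):
            sock.sendall(data[start:start + blocksize])
    # Leaving the with-block closes the socket; the close is the end marker.

def demo(payload=b"hello over the wire\n" * 1000, blocksize=4096):
    """Run both halves on localhost and return what the server received."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    received = []
    server = threading.Thread(target=serve_once,
                              args=(listener, received, blocksize))
    server.start()
    send_all("127.0.0.1", port, payload, blocksize)
    server.join()
    listener.close()
    return b"".join(received)
```

There is no header, length field, or acknowledgement here, which is what keeps it fast and also why it can't tell a finished transfer from a dropped connection.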
Anyway, here's the real use case. If I'm running tar or something like that, and I ultimately want the tarball on another computer, and the computer I'm running tar on doesn't have enough room for the file, I just pipe tar's output into the client, run the server on the machine with the big hard drives, and redirect the server's output to a file there. That's the main point of it, but it could also be used for sending plain files (although since you probably should be running it through an SSH tunnel anyway, scp would be a much better choice for that).
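The tar case boils down to writing the archive straight into a stream instead of a local file. A small Python sketch of that idea, using the standard tarfile module in streaming mode; `stream_tar` is a made-up name, and `out` stands in for whatever feeds the network (a pipe, a socket file, and so on):

```python
import os
import tarfile

def stream_tar(paths, out):
    """Write a gzip-compressed tar of paths to out, with no temp file.

    Mode "w|gz" is tarfile's streaming mode: it only ever calls
    out.write(), so out does not need to be seekable.
    """
    with tarfile.open(fileobj=out, mode="w|gz") as archive:
        for path in paths:
            # Store each entry under its base name (an arbitrary choice here).
            archive.add(path, arcname=os.path.basename(path))
```

Since nothing is ever seeked or rewritten, the sending machine only needs room for the data in flight, not for the finished tarball.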
The two biggest issues are data integrity and security. So far there are no known bugs that corrupt data, but since this goes over a network, a bit here or there can always get messed up. Since it uses TCP, which retransmits lost packets and checksums each segment, there probably won't be any lost data, but don't count on it. This is NOT AT ALL secure. DO NOT run the server on a port open to the public. DO NOT use the client to directly send any data over a public network that you wouldn't want others to see. While it's designed for use on a small LAN, it can be used safely over the Internet through SSH tunnels. They should even help with the integrity issues, since SSH puts a MAC (a keyed checksum) on every packet.
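One cheap way to check integrity without touching the protocol is to hash the stream on both ends and compare the results by hand. A minimal Python sketch; `digest_stream` is a hypothetical helper, not part of the scripts, and SHA-256 is just one reasonable choice of hash:

```python
import hashlib

def digest_stream(stream, blocksize=65536):
    """Return the SHA-256 hex digest of everything readable from stream."""
    h = hashlib.sha256()
    while True:
        block = stream.read(blocksize)
        if not block:              # EOF
            break
        h.update(block)
    return h.hexdigest()
```

Run it over the data going into the client and again over the file the server wrote; if the two digests match, nothing got flipped along the way. The blocksize used for hashing does not affect the result.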
As for speed, the biggest factors you can change are the compression (method, blocksize, and which end it runs on), the client blocksize, and the server blocksize. Both the client and the server let you set a blocksize in bytes: each one reads that much and then writes it all at once. Perl's own buffering is bypassed, because it seems to cause problems. It's probably best to make the client blocksize a multiple of the payload size of a TCP packet, so that every packet goes out full. If compression is being used on the client side, the client blocksize should be a multiple of the compression blocksize too. I'm not quite sure how one should pick the server blocksize, but I like to set it equal to the client's. Keep in mind that the memory usage of both the client and the server is more or less the blocksize, since each block is effectively a buffer. I think that if you have memory to spare, you should probably use a big blocksize. It would make sense that bigger blocks use less CPU, since there are fewer trips through the loop (each one is 1 read and 1 write).
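Here is roughly what the blocksize advice looks like in code, again as a Python sketch and not the Perl itself. The default payload size of 1460 bytes assumes plain Ethernet (MTU 1500) with IPv4 and no TCP options; the 64 KB target and all the function names are arbitrary choices for illustration.

```python
def blocksize_for(mss=1460, target=65536):
    """Largest multiple of mss that fits in target (at least one mss)."""
    return max(1, target // mss) * mss

def read_full_block(src, blocksize):
    """Keep reading until blocksize bytes are collected or EOF is hit."""
    parts = []
    remaining = blocksize
    while remaining > 0:
        chunk = src.read(remaining)
        if not chunk:              # EOF: return whatever was collected
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)

def copy_blocks(src, dst, blocksize):
    """Copy src to dst one block at a time; return the count of FULL blocks."""
    full = 0
    while True:
        block = read_full_block(src, blocksize)
        if not block:
            break
        dst.write(block)           # one write per block
        if len(block) == blocksize:
            full += 1
    return full
```

With the defaults, `blocksize_for()` gives 64240, which is 44 full packets' worth. Memory use in `copy_blocks` is about one block, and doubling the blocksize halves the number of loop iterations for the same data.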
Both programs take 3 arguments: host, port, and blocksize. The server takes a host so that it knows which address to bind to; if you want it to bind to any IP, I think you can get that by modifying the script. The client reads STDIN and the server writes STDOUT; the other way around should not be used. STDERR carries the block counter, which tells you how many FULL blocks have been sent/received so far (with small blocksizes, the counts on the client and server should be very close, but with larger ones they might not be). Any errors thrown by the script or by Perl itself should also go to STDERR. In most cases, it should die on any error (although if you are running perl -w, which I am since this is alpha-quality, other warnings will show up on STDERR that won't make it die).
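The argument handling and the STDERR counter could look something like this in Python. This is only a sketch of the interface described above: the usage text, the validation rules, and the counter format are all made up, not taken from the scripts.

```python
import sys

def parse_args(argv):
    """Return (host, port, blocksize) from a three-item argument list."""
    if len(argv) != 3:
        raise SystemExit("usage: host port blocksize")
    host, port, blocksize = argv[0], int(argv[1]), int(argv[2])
    if not 0 < port < 65536:
        raise SystemExit("port must be between 1 and 65535")
    if blocksize <= 0:
        raise SystemExit("blocksize must be positive")
    return host, port, blocksize

def report_blocks(count, err=sys.stderr):
    """Write the running full-block count to err, reusing the same line."""
    err.write("\r%d blocks" % count)
    err.flush()
```

Keeping the counter on STDERR matters on the server side, where STDOUT is the data itself and anything else written there would end up in the output file.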
I'll post these programs later under the GPL.

My other big accomplishment is setting up a RAID 5 array using mdadm. Right now I have ReiserFS 3.6 on it.

Right now I am testing out both on the krisa compy. So far I have had one very strange data-integrity issue. I'm going to try partimaged. I'm really hoping it's just my stupid script's fault and not something else that someone else wrote that should work. Especially not the RAID.

2 Comments:

At 12/26/2005 01:00:00 PM, Anonymous said...

Did you break the 1 TB mark with the RAID 5 array?
Any name yet for the prog?

 
At 1/16/2006 07:13:00 PM, Chiris said...

no i haven't; insufficient funds. ~240GB...24% there!
no i haven't; the scripts are called client.pl and server.pl, or send.pl and recv.pl
i should be releasing it soon though. thanks for the reminder.

 
