Knowledge base - Posts tagged with: plonk
It's not the servers - it is your braindead WordPress developers!
At Zubr Communications most of our business is designing, deploying and maintaining highly critical infrastructure that carries the flashy things others put on top of it. Some say we do hosting. I say we do the Internet plumbing and we do it very well. We may not be experts in Flash or the latest VRML tricks, but we know how the gear (be it server or network) actually works, why it works the way it does and what it means for the applications and the OS that run on it.
So there's probably nothing that concerns us more than those customers who choose to continuously shoot themselves in the foot by listening to web designers/developers/flakes who, between watching Jerry Springer and Maury, somehow obtained their "computer and networking degree" from a diploma mill. Sooner or later these "developers" do something that results in the sites going down, and we have to explain yet another time that "No, it was not our servers that caused it. It was your software".
On HP DL140 G3, PS/2 keyboard and mouse... are USB devices.
While tuning the OS is more art than science, deploying a tuned OS on a known platform should be a piece of cake. It was not the case for the upgrade of a customer's HP DL140 G3 to a new clean and tight packaging of Linux optimized to the hardware and specific tasks this particular server was supposed to perform...
[zdeploy@deploy-master] $ zdeploy --target border1.phl2/3:2 --load_target 192.168.2.22 --payload f13-zubrcom-small-web-x64
Nine minutes later the system rebooted and came back on the network with 64-bit Fedora 13. Everything seemed fine except the keyboard. It was solidly locked with Num Lock lit.
dmesg indicated that the PS/2 keyboard simply was not found.
"If it does not scale, it is broken by design"
Today a server of a customer with fantastic uptime suddenly lost its MySQL process while the customer was in the middle of a minor tweak of the WordPress platform.
Investigation revealed that the InnoDB storage engine was not able to allocate memory pages for a routine operation and, in its most bizarre way of handling errors, did a "safe" crash of the MySQL server. (No, there is no such thing as a "safe" crash, so please, dear MySQL folks, add sane error handling or stop pretending you are an "industrial-strength" SQL server!)
Further conversation with the customer revealed that the developer, following an example in the PHP Manual, decided it was a good idea to do this:
A tale of backup transit...
A Senior Sales Manager of an unnamed company that claims to be a "regional leader in business connectivity", familiar with our requirements for backup transit (gige via in-building cross-connect, BGP, low CIR, etc.), tells a Senior Sales Droid to call us with a quote. The conversation went like this:
Zubrcom: I need transit over gige PNI at 401 N. Broad. Can you do that?
Senior Sales Droid: You need PRI?
Zubrcom: No, I need transit over gige PNI.
Senior Sales Droid: Private Network Interface.
Zubrcom: Yes.
Senior Sales Droid: What do you need that for?
Zubrcom: Transit.
Senior Sales Droid: Between where and where?
Sun burning through the clouds....
When a sales droid is selling you virtualization as a way to save over your clueful service provider, not only is he selling you the rainbows and magic, but also this level of availability:
Mon Jul 26 08:46:04 2010|http://www.importantsite.com|Failure|Code: 500|61 second(s).
Mon Jul 26 08:48:03 2010|http://www.importantsite.com|Failure|Code: 500|61 second(s).
Mon Jul 26 08:48:47 2010|http://www.importantsite.com|Failure|Code: 500|45 second(s).
Mon Jul 26 08:49:11 2010|http://www.importantsite.com|Failure|Code: 500|9 second(s).
Mon Jul 26 08:50:04 2010|http://www.importantsite.com|Failure|Code: 503|1 second(s).
Mon Jul 26 08:51:03 2010|http://www.importantsite.com|Failure|Code: 503|1 second(s).
Mon Jul 26 08:52:03 2010|http://www.importantsite.com|Failure|Code: 503|1 second(s).
Mon Jul 26 08:53:02 2010|http://www.importantsite.com|Failure|Code: 503|0 second(s).
Mon Jul 26 08:54:03 2010|http://www.importantsite.com|Failure|Code: 503|1 second(s).
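A log like this is easier to argue about once it is tallied. The following is a minimal sketch, not the monitoring tool's own reporting: it assumes the pipe-delimited layout seen in the excerpt above (timestamp, URL, status, "Code: NNN", "N second(s).") and the function name summarize_failures is made up for illustration. It prints one line per HTTP status code with the number of failures and the sum of the reported seconds.

```shell
# Summarize a pipe-delimited monitor log read from stdin.
# Assumed field layout (inferred from the excerpt, not from any tool docs):
#   timestamp|url|status|Code: NNN|N second(s).
# Output: one line per status code: "<code> <failure count> <total seconds>"
summarize_failures() {
    awk -F'|' '
        $3 == "Failure" {
            split($4, code, ": ")   # "Code: 500" -> code[2] = "500"
            split($5, secs, " ")    # "61 second(s)." -> secs[1] = "61"
            count[code[2]]++
            total[code[2]] += secs[1]
        }
        END {
            for (c in count)
                printf "%s %d %d\n", c, count[c], total[c]
        }
    ' | sort
}
```

Assuming the log is saved in a file (monitor.log is a hypothetical name), run it as: summarize_failures < monitor.log. For the nine lines quoted above it reports four 500s totalling 176 seconds and five 503s totalling 4 seconds.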
Linux Networking Housekeeping
Routing Table Cleanup
While the 169.254.0.0/16 route shows up only on machines with multiple NICs, it has no place on the systems at all. Get rid of the annoying Zero Configuration crap by adding the following line to /etc/sysconfig/network:
NOZEROCONF=yes
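The setting is read when interfaces are brought up, so the route stays in place until the network is restarted. To confirm it is really gone, something like the following sketch may help. The function name has_zeroconf_route is made up for illustration, and it assumes the routing table is in the format printed by "ip route show", where each route line starts with its prefix.

```shell
# Succeeds (exit status 0) if the routing table read from stdin
# still contains the link-local zeroconf route.
# Assumes "ip route show" style input: one route per line, prefix first.
has_zeroconf_route() {
    grep -q '^169\.254\.0\.0/16 '
}
```

Typical use after restarting the network: ip route show | has_zeroconf_route && echo "zeroconf route still present"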
MySQL: Fix bad connection error defaults
For whatever reason MySQL server software ships with really stupid defaults for handling problems with incoming connections. Obviously, Zubrcom's installation of MySQL does not suffer from this problem. However, if you have braved rolling your own installation it may be helpful to add the following lines to /etc/my.cnf:
[mysqld]
max_connections = 1100 # max number of simultaneous client connections
max_connect_errors = 99999999 # do not block a host until it has produced this many failed connection attempts in a row
max_user_connections = 1100 # max simultaneous connections per user account
If MySQL is already running, log in to the MySQL root account and use the SET GLOBAL command to change these variables without restarting MySQL. Keep the lines in /etc/my.cnf as well, so the values survive the next restart.
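As a rough sketch of that last step, the helper below only prints the SET GLOBAL statements matching the my.cnf values above, followed by a SHOW GLOBAL VARIABLES query to check the result. The function name mysql_limits_sql is made up for illustration, and the sketch assumes the stock mysql command-line client and a root account that is allowed to change global variables.

```shell
# Print SET GLOBAL statements that mirror the my.cnf values above.
# Nothing is sent to the server here; pipe the output into the mysql client.
mysql_limits_sql() {
    cat <<'SQL'
SET GLOBAL max_connections = 1100;
SET GLOBAL max_connect_errors = 99999999;
SET GLOBAL max_user_connections = 1100;
SHOW GLOBAL VARIABLES LIKE 'max_%connect%';
SQL
}
```

To apply it to a running server: mysql_limits_sql | mysql -u root -p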

