Thursday, January 30, 2014

My Last Post On Blogger


I noticed today that all the images in my blog posts are gone. After some digging I found a thread on Google support where they say:

"Hi everyone, we've recently added a confirmation dialog to Google+ that asks you to confirm if you want to permanently delete your album and its photos from all Google products, including Picasa Web Albums, Blogger, and others. We can't restore albums you've deleted, but we're working to try to prevent this from happening in the future"

Yes, I guess I was purging some 5000 pics from my drive to avoid paying for extra storage, but it is beyond me why someone would think that storing the real images (not some symlinks) for published blog articles alongside auto-uploaded photos from my cell phone is a good idea, and protecting them with the same kind of 'are you sure?' controls.

I am moving away from Blogger, and this is my last post on this platform.

And so my parting words to the Google person who decided to store live images for published blog articles in my Google Drive or Google+: buddy, I am really trying to be professional about it, and buddy, I know it was me who did not fully read some warning where you listed 'Blogger' in 25th place on the list instead of saying 'If you delete this folder your published blog posts will lose all illustrations!', but buddy, I lost images for several years of posts, so buddy - you are a fucking moron.



Wednesday, August 7, 2013

Total Failure: New AWS Console UI


A couple of months ago AWS started rolling out a new version of the UI for the AWS console. In short, it is yet another illustration of the articulate idiot problem I wrote about a couple of years ago.

So, why do I say that the new AWS console UI is a total failure? If I were to condense it into a single sentence it would be this: the old UI was about functionality, the new UI is about looking nice.

The old UI was like a tool written by developers for developers - plain-looking but very condensed, packing all the useful info I needed when and how I needed it. Unfortunately I don't have access to the old UI anymore, and this is the best I could find on the internet:

Let’s take a look at how it looks and works now with the new UI:
Notice that there is no right pane. So when I click on a bucket, the bucket list goes away and is replaced by the file list.
Which means that when I am looking for a certain file without knowing its bucket, or am just peeking at my buckets, I have to go back to the list of all buckets, click on yet another bucket, and then repeat that for N buckets. Why? Because the person who designed it never used it in a real-life situation where things like this do happen.

So you might say that this is not a good example. Ok, let's try to do more work. Let's say I want to change some properties of this 'css' folder. In the old UI I would just click on the Properties button, and the properties pane would always stay, with the properties shown right away. Let's click on Properties in the new UI:

See something useful here? Dear user, with the new improved and more user-friendly UI you have to make one extra click on ‘Details’.

Let’s take another example. Suppose I want to view properties of the bucket. I click on the bucket name:

Notice how you see all the important properties at a glance. That is, if you start clicking on each section one by one. Oh, and when you click, it's accordion style, i.e. if you click on Permissions and then click on 'Static Website Hosting', the Permissions section collapses again.

In conclusion, let's take a look at the new and improved Elastic Beanstalk:

What I really care about here is just the status and recent events. Now notice how much vertical space is wasted to convey so little, leaving just a sliver at the bottom for recent events. Which means I am always forced to click on 'Show All'.

So, while I am sure the new UI looked impressive in screenshots and on dummy accounts, it was clearly not designed by a person who actually uses the system on a day-to-day basis.

I am tagging Amazon (the AWS division) as yet another company where an articulate idiot can thrive and mess up the user experience of thousands of users.

Thursday, February 21, 2013

We Should Expect To Pay For Small Open Source Projects



I love open source. I use open source. I contribute to open source; I wrote SimpleDB and DynamoDB Grails GORM adapters for AWS. And I am not sure I will do anything like that again. You see, there is a big problem with Type 0 Open Source Projects (read below).

A recent post by Marc Palmer coincided with what I had been suspecting for a long time and acted as a catalyst for this post, firming up my belief that we need to start supporting open-source developers by paying some small amounts as a required step, not voluntarily.

First, let's take a step back and try to classify the various kinds of Open Source Projects (OSPs). This is my own classification based on my understanding of how things go, and I could have missed or misinterpreted something. Please let me know if there are inaccuracies below.

Types/stages of Open Source Projects (OSP):


  • Type 0 - part-time enthusiasts - very small OSPs (for example, grails plugins), typically developed by one to three people, often in a very early stage, often with minimal features and relatively low adoption. Typically the developers are not working on the project full time and are employed somewhere else.
  • Type 1 - consulting-financed - an OSP which started getting a lot of adoption and publicity (for example, compass/elasticsearch three years ago, or JBoss in 2001/2002). Because of the widespread usage, the core development team can now afford to create a consulting company and focus full-time on actively improving/developing the project while being hired by early adopters to solve problems and add high-priority features.
  • Type 2 is the ultimate goal of Type 1 projects - acquisition/investors/IPO - the developers are waiting for a large entity to buy out the consulting company (VMWare buying SpringSource, RedHat buying JBoss etc) or to get an investment from a VC with the goal of being acquired later or going straight to IPO. After that the developers have equity and are paid good salaries to develop the project. At this point the developers are forced to work on two streams:
    • supporting and improving core project features (version 1.3->1.4->1.5 etc) - in open source domain
    • developing paid proprietary customizations of the project (Redhat Enterprise) or working on some sort of proprietary platform (for example CloudFoundry hosting platform for VMWare)

Progression from each stage to the next is an order of magnitude harder in terms of the required number of installations and features.

What this basically means is that if a developer is working on a Type 0 project (pretty much part-time), the only way to make any money and keep working on the project is for it to become so *widely* popular that donations make it feasible (I don't know too many examples) or … and here is the problem. Unless the developer manages to convince his day-time employer to use the project as a component in the employer's IT ecosystem and thus pay for day-time improvements to the OSP, I don't know any other way a developer can sustain a Type 0 OSP.

Out of some 900 grails plugins, too many are unsupported or exist with just a rudimentary set of features, so oftentimes I am simply afraid to start using one and end up writing the functionality myself. I strongly suspect the situation is similar for plugins (Type 0 OSPs) for other frameworks and platforms.

So for Type 0 OSPs we have a situation where none of us wins. The only sustainable solution is to pay those part-time developers who are working really hard on making something new. It should of course be up to the developer of the OSP to decide which model to use, but the development community should change its mindset to accept/expect two basic modes for all Type 0 OSPs:

  • status-quo - everything is totally free
  • required purchase of a license. There are many variants of how this can be done (for example, free for dev testing and required license for all customer-facing/public internet deployment etc), but the goal is to force users of Type 0 OSPs to pay some token amount of money like $15 or even $5 to the developer.

Without accepting the fact that we need to pay for Type 0 OSPs, a lot of good things are left to wither and die. By paying just a little more than nothing we will see a much stronger and healthier Type 0 OSP ecosystem.

Do you agree? Disagree? I would love to hear your thoughts.

Wednesday, December 5, 2012

We need app/device/websites interlinking


I find my current computer experience very frustrating. For example, when I want to check my gmail I have to open a browser, click on some bookmark, and possibly log in (or re-log in as another user if my spouse is currently logged in to her gmail). Or when I am searching for something, I have to click link by link, open the results in new tabs, and then close some of them when there are too many tabs.

And the root of the problem, in my opinion, is the lack of interlinking. My biggest irritation is that nothing has really changed in the last 40 years - we still work with our computers via a flat screen using some kind of window-oriented solution where each window represents an application, each application is generally unaware of the rest of the running applications, and we control the process within each application at a very minute level of detail by clicking the mouse or pressing something on the keyboard.

Taking my previous gripe as an example, my desire to check gmail is called, in my head, simply a 'check-my-email' scenario, but when it comes to realizing it in practice I have to perform all the tiny actions: moving and clicking the mouse to open a browser or a new tab in a browser, typing my password, clicking the Login button, etc.

Furthermore, when in front of the computer I very seldom work with just one application exclusively (okay, except for when playing Unreal Tournament and watching Netflix) - most of the time I am solving some kind of problem which involves more than one application or website, i.e. I work in scenarios:
  • going out for a movie - check rating on metacritic.com, if rating is more than 60% then find showtime around 7pm near my home (and possibly, though I very seldom do it - buy ticket on fandango and print it).
  • working in my IntelliJ IDEA on java code and then googling for a strange exception error message and going through all the forums and mail lists trying to figure out what is causing it and how to fix it. Note: in this particular scenario I am only interested in forums and sites dealing with java and with the specific library (or even version of the library) I am having a problem with. As a side note, it was after repeatedly doing these steps that I decided to create www.brainleg.com
  • researching a doctor I am going to visit by checking him on all medical sites (note: in this scenario when I am looking for ‘Bob Smith’ MD I do not want to see any managing directors and I am interested in ‘Bob Smith’ only in my town and state, not some guys across the country) and then when I think he is okay I need to find directions to his office, check office hours, add him and his address to my contacts, and print directions just in case.
By the way, the attempts of search engines to be smart and return relevant results (when you search for 'pizza Portland' you will get different localized results depending on whether you are in Portland, Oregon or Portland, Maine) stem from trying to deduce the end user's scenario from what he types and from his previous searches (a practice which DuckDuckGo calls google's bubble). Currently the only way for the end user to communicate his scenario is via additional search keywords, which most of the time fails miserably and leads to lots of clicking on the results; and even when the search engine guesses my scenario correctly, the output is just paginated search result links.

We need a new interactivity paradigm (let’s code name it ‘Stream’) which will shift the interaction model from being application action-centered (clicking on the UI or typing search queries) to user scenario-centered by interlinking devices, applications and websites.

In a nutshell, Stream will consist of two things:
  1. Human communication API: a very rapid request/response cycle as close to the speed of thought as possible
  2. Abstraction API: the ability to expose high-level scenarios to the user and/or to other Stream devices/applications/websites
Let’s review each item in more detail.

Human Communication API
drives my ability to execute a mental request which the target device supporting the Stream interface (a computer, my home phone, or my bedroom AC) would somehow recognize. The implementation of recognition can be:
  • reading my facial expression (via high resolution webcam)
  • reading my hand/finger gestures (via high resolution webcam or my cell phone’s camera looking at my hands)
  • reading my tactile gestures
  • voice commands
  • reading movements of the eyes and correlating these movements to a pixel-precision location on the computer display (let's say there is a new type of icon on the desktop - a 'visiicon' - something which triggers a command when I look at it for more than 1.5 seconds)
  • ideal solution - real brain-computer interface (hopefully the non-invasive kind)

Abstraction API: each device/application/website must provide a new kind of api which would bridge scenarios (what is exposed to the outside) with the usecases, considerations, and sequences of actions (the internal implementation) that must be taken on the device/application/website to accomplish the scenario.

The Stream API is not an API in the typical sense - instead of exposing a bunch of low-level methods like 'Email[] checkEmail()', the goal is to tell other Stream devices: 'I have a check-email scenario you can use; internally I will figure out if I need extra input from the user via the Human Communication API, and these are the possible outcomes of this scenario'. This way the creator of the device/application/website (who generally has the best insight into the possible usecases and capabilities) puts together and exposes not just atomic actions but thought-through, implemented scenarios which the Stream can take and communicate with the user via the communication module (if user input is needed) and with other Stream devices.
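As a purely illustrative sketch (every type and method name below is hypothetical - no such API exists anywhere), a Stream-enabled email app might expose its check-email scenario along these lines:

 // Hypothetical sketch only: the unit of exposure is a whole scenario, not a low-level
 // method like Email[] checkEmail(). All names are made up for illustration.
 import java.util.Collections;
 import java.util.List;
 import java.util.Map;

 interface StreamScenario {
     String name();                                // e.g. "check-email"
     List<String> requiredUserInput();             // prompts the Human Communication API should ask, if any
     String execute(Map<String, String> answers);  // runs the internal usecases, returns a human-readable outcome
 }

 class CheckEmailScenario implements StreamScenario {
     public String name() { return "check-email"; }

     public List<String> requiredUserInput() {
         // Only ask for credentials when there is no active session; the caller never sees this decision.
         return Collections.emptyList();
     }

     public String execute(Map<String, String> answers) {
         // Internally: log in if needed, fetch unread mail, summarize - all hidden behind the scenario.
         return "3 unread messages";
     }
 }

The point of the sketch is that the outcome ("3 unread messages") is what gets handed to the Human Communication API or to another Stream device, not the raw mailbox data.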

Having a standard API will allow competition among developers to provide more than one kind of gateway to the rest of the Stream ecosystem - some might develop a new Siri-like voice interface, others might provide smart free-text entry desktop gadgets which try to figure out user scenarios on the fly, confirm them, and execute them, and others will provide visiicons and 3d gesture support.

I am really waiting for the day when we can raise the user experience for typical scenarios to a new level where we communicate in abstractions (scenarios) rather than through a series of button and menu clicks with isolated apps/devices/websites.

What do you think?

Friday, November 30, 2012

O Brother, Where Art Thou? BrainLeg: Geographical Java Developers Distribution



In this post I will present some statistics on the geographical distribution of java developers based on the usage of http://www.brainleg.com - a Structural Java Exception Search Engine. Exception troubleshooting is one of the most frequent tasks every java developer faces, so BrainLeg website visits should be roughly representative of overall java development activity on the planet.

Do you ever wonder, of all the countries in the world, which country uses java the most? What are the major java city-hubs?

As of Nov 2012, BrainLeg has about 240,000 java exceptions in its database coming from various sources - StackOverflow, various mail lists, tons of java-related forums etc. One point must be discussed before we jump to the statistics - selection bias in the coverage of java technologies indexed. If a certain technology is not indexed, google won't drive traffic for its exceptions to BrainLeg, and those java developers will be missing from the stats.

Having spent a lot of time finding sources of java exceptions, I can confidently say that StackOverflow is slowly taking over the world in terms of the number of exceptions posted there as well as the breadth of topics covered. The daily average for most java-related forums is 1-3 posted exceptions, whereas StackOverflow manages 10 exceptions on any given Sunday, doubling and tripling that number on a weekday. And since SO is present in the dataset, it should provide at least some guarantee that a variety of java technologists are included in the statistics.

And one last note: the numbers below should be taken with some margin of error, especially when comparing one to another. I tend to think that at least a 30% margin should be applied when comparing any two numbers or percentages given below, to account for various sampling errors.

Ok, let's take a look at the numbers. All statistics below cover the time period Sep 1 - Nov 24.

Figure 1. Distribution of java developers by country:

 Note: Germany has an inflated number (by about 50%) because it includes website monitoring hits from a monitoring service located in Germany. Unfortunately it was too problematic to exclude them because over time they change their agent identification as well as their monitoring location IPs.

So the US is leading in java developers, followed by India, with Germany and China having about the same number of java developers.

Figure 2. Distribution of java developers by US state:

As you can see, in the US the most active java state is California followed by New York state and then Texas.

Let’s take a look at java City-hubs:

Figure 3. Distribution of java developers by Cities:

A completely different picture emerges. 4 out of the top 5 java hubs are located in India. And only one US city makes the top 25 - New York City (and for some reason below Paris! But being a New Yorker I am certainly going to refer back to my note about the 30% margin of error and dismiss it).

Wait, where is California?

In Figure 2, California has more than twice the number of java developers looking for exceptions compared to New York state, so where are they city-wise? Drilling down into California we see the following stats, which explain why none of the California cities is in the top 25 java city-hubs - the developers are spread across many cities that are geographically quite close to each other.

Conclusion

I don't think the city-level comparison should be taken too seriously - as demonstrated by the previous table, it is more about clustering than about specific cities. However, I was shocked to discover just how dense India's java development is (Bangalore having about the same number of java developers as the whole of California and twice as many as New York City). Someone will undoubtedly soon comment about the quality of java developers, and yes, this is something that matters a lot, yet I believe that people do get better, and today's junior developer is tomorrow's senior developer.

Country-wise the US is still leading, but it is great to see other countries all over the world very actively using java! It would be really interesting to compare java to other languages, but at the moment http://www.brainleg.com handles only java exceptions, so I can't speculate about Ruby, for example.

Thank you for reading, appreciate any comments and feedback!

Friday, November 16, 2012

Your Own Unix Box as a Cheap Alternative To AWS Beanstalk



(Shhh, I run my WAR file on AWS Beanstalk and on a Unix box under my table)

In this article I will cover some simple steps to migrate from an AWS Beanstalk environment to your own unix server, while retaining the ability to revert back to AWS Beanstalk in less than 1 minute if needed.

Four years ago I purchased a quad core box to run my jira, confluence, and cvs. I now use a free bitbucket Git repo with its ticketing system, and Google docs instead of confluence, so my old server had been sitting powered off for the last 3 months. Recently I was analyzing one of my quick projects hosted on AWS Beanstalk and started thinking about the costs of running it on Beanstalk and how they could be optimized.

The biggest benefits of Beanstalk from the point of view of a software developer are the extremely simple deployment model (just upload the latest .war file) and the possibility of elastically adding new instances if the traffic goes up. Yet for organically growing sites in their infancy, steep traffic spikes are more of a big hairy audacious goal than an everyday reality - you might have your server running on a c1.medium instance for several months and see very little end user traffic, yet pay for 24x7 usage of the server (running on a t1.micro is not really an option if you use java and have CPU spikes, because such spikes tend to kill micro instances on the spot).

I set myself a goal: in 10 hours investigate how I can host my AWS Beanstalk app locally with the following conditions:
  1. it should be extremely simple to revert back to AWS Beanstalk hosting if the traffic does go up, or if my local box is experiencing problems (a hardware problem, internet connectivity issues etc.).
  2. the same war file should be runnable on both local and AWS Beanstalk environments
  3. my locally hosted site should support SSL
  4. it should be very simple to set things up

How my project is currently set up on AWS

  • There are two subdomains:
    • the www. subdomain is used to host purely static content such as the home page, all images, stylesheets etc. This is served by cloudfront, backed by S3 (the DNS record for this subdomain is a CNAME pointing to the cloudfront distribution URL). Whenever the static html needs to point to dynamic app functionality ('Register', 'Login' links etc.) it points to the www1 subdomain where the dynamic app is running. As long as I keep the links to the www1 subdomain exactly as they are, I do not need to change anything on the static site.
    • www1. subdomain points to AWS Beanstalk environment (CNAME record).
  • I use the SimpleDB, SES, and S3 services, which means all connectivity to these services is done via the AWS SDK API and is deployment location-agnostic (it will of course run slower from my home than from the AWS datacenter, but it is still fast enough; see the short sketch after this list). If you use a traditional SQL server on AWS you might have to modify the firewall rules on those servers to allow connectivity from your public IP.
So the goal is to be able to point my www1 subdomain to my local server when I run the app locally, and flip it back to the AWS Beanstalk URL when I decide to run it on Beanstalk again.
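To illustrate the location-agnostic point from the list above, here is a minimal sketch (AWS SDK for Java 1.x; the bucket, key, and environment variable names are hypothetical) - the same code runs unchanged on Beanstalk and on the box under my table:

 // Minimal sketch: only credentials matter, not where the code is deployed.
 import java.io.File;
 import com.amazonaws.auth.BasicAWSCredentials;
 import com.amazonaws.services.s3.AmazonS3Client;

 public class S3UploadCheck {
     public static void main(String[] args) {
         AmazonS3Client s3 = new AmazonS3Client(
                 new BasicAWSCredentials(System.getenv("AWS_ACCESS_KEY"), System.getenv("AWS_SECRET_KEY")));
         // Works identically whether invoked from an EC2 instance or from the home server.
         s3.putObject("my-bucket", "reports/latest.html", new File("/tmp/latest.html"));
     }
 }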


Configuring unix box

Instead of reconfiguring my old unix server I decided to install a fresh CentOS on that box and configure the system from scratch.
  1. Downloaded the latest CentOS .iso image (6.3 at the time of this writing) and burned a DVD. Ran the live CD and then clicked the 'install to hard drive' icon on the desktop. During installation I named the non-root user who would be using the system 'developer'.
  2. Added developer to wheel group:
     sudo su -  
     usermod -a -G wheel developer  
    
  3. add wheel to sudoers: as root run ‘visudo’ and uncomment this line:
     %wheel ALL=(ALL)    NOPASSWD: ALL  
    
  4. Downloaded and installed 32-bit java. A quick note on why I chose 32-bit java: on AWS I am running on a c1.medium instance, which means a 32-bit system with a 32-bit JVM. Even though my local box has much more memory, I wanted exactly the same memory model as on AWS so that when I switch back to AWS the app would run the same. Detailed installation instructions can be found here. I installed jdk 1.6.0_37 to /usr/java and after that made the 'developer' user the owner of that dir:
       cd /usr  
       chown -R developer java  
    

Application setup

  1. All my application-related files and apps will be located under a directory I called '/company'. As root execute the following commands:
     cd /  
     mkdir company  
     chown -R developer company  
    
  2. Become user 'developer' and download and unpack the latest tomcat (version 7.0.32 at the time of this writing) to /company/apache-tomcat-7.0.32. Detailed instructions can be found here. Please note that I customized the service file a little bit: a) the 'tomcat' service will run tomcat as the 'developer' user and will run it from the tomcat bin folder - if the app tries to create dummy files, it will be doing so in a directory it has write access to; b) I changed a bit how tomcat is started/shut down to work with ssh invocation, since for some reason the original version was giving me trouble when invoked remotely. As root:
     cd /etc/init.d  
     vi tomcat  
    
    then provide the following text:
     #!/bin/bash  
     # processname: tomcat  
     # chkconfig: 234 20 80  
     JAVA_HOME=/usr/java/latest  
     export JAVA_HOME  
     PATH=$JAVA_HOME/bin:$PATH  
     export PATH  
     CATALINA_HOME=/company/apache-tomcat-7.0.32  
     case $1 in  
     start)  
     cd $CATALINA_HOME  
     /bin/su -c "$CATALINA_HOME/bin/startup.sh" developer  
     echo "done start"  
     ;;  
     stop)  
     cd $CATALINA_HOME  
     /bin/su -c "$CATALINA_HOME/bin/shutdown.sh" developer  
     echo "done stop"  
     ;;  
     restart)  
     cd $CATALINA_HOME  
     /bin/su -c "$CATALINA_HOME/bin/shutdown.sh" developer  
     /bin/su -c "$CATALINA_HOME/bin/startup.sh" developer  
     echo "done restart"  
     ;;  
     esac  
     exit 0  
    
  3. Specify an appropriate memory size in the /company/apache-tomcat-7.0.32/bin/catalina.sh file. If you use Grails, make sure you provide a big enough MaxPermSize as well:
     JAVA_OPTS="-XX:MaxPermSize=160m -Xms1400m -Xmx1400m"  
    
  4. The rest of the service setup is as per the tomcat installation article mentioned above; after you are done, make sure you can start up and shut down tomcat using
     service tomcat start  
     service tomcat stop  
    
  5. Reboot your unix server to make sure tomcat starts up correctly after a reboot. Tip: if it doesn't start up, check that the 'tomcat' service is enabled in the System->Administration->Services dialog.
If all is well you will be able to see the default tomcat page at http://localhost:8080

Now delete all apps and their exploded dirs from /company/apache-tomcat-7.0.32/webapps; we will start deploying our own.

Deploying your app


Tip: during testing avoid deploying your production war file. Instead, create a dummy web app with a dynamic 'hello world' + new java.util.Date() message on some dummy servlet url (hopefully you are not really writing servlets in 2012 and use something cool like Grails).
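For example (ironically, as a plain servlet, since that is the smallest thing that runs anywhere), a throwaway test app could be as small as the sketch below - the class and URL names are hypothetical, and it assumes the Servlet 3.0 annotations that Tomcat 7 supports:

 // Hypothetical deployment-test servlet: the changing timestamp makes it obvious that the
 // freshly pushed war, and not a stale one, is actually serving requests.
 import java.io.IOException;
 import javax.servlet.annotation.WebServlet;
 import javax.servlet.http.HttpServlet;
 import javax.servlet.http.HttpServletRequest;
 import javax.servlet.http.HttpServletResponse;

 @WebServlet("/ping")
 public class PingServlet extends HttpServlet {
     @Override
     protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
         resp.setContentType("text/plain");
         resp.getWriter().println("hello world " + new java.util.Date());
     }
 }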

We will develop a simple Ant script to push the war file to the unix box running tomcat, but first we have to enable SSH on your unix box:
  • in System->Administration->Firewall make sure port 22 is enabled
  • in System->Administration->Services make sure ‘sshd’ service is enabled and running (OpenSSH server)
Now create a /company/deploy-temp directory on the unix server and an ant build.xml file on the machine where you build your war file:
 <project name="build" default="push-war" basedir=".">  
   <property name="host" value="unixhostnamehere-changeme"/>  
   <property name="user" value="developer"/>  
   <property name="pass" value="passwordhere-changeme"/>  
   <property name="remoteTempDir" value="/company/deploy-temp"/>  
   <property name="tomcat" value="/company/apache-tomcat-7.0.32"/>  
   <property name="originalWarName" value="myapp-0.1.war"/>  
   <property name="originalWarFullPath" value="C://Source/myapp/target/${originalWarName}"/>  
   <target name="push-war">  
     <echo message="copying war file..."/>  
     <scp file="${originalWarFullPath}" todir="${user}@${host}:${remoteTempDir}"  
        password="${pass}"/>  
     <!--shut down tomcat -->  
     <sshexec command="sudo /sbin/service tomcat stop"  
          host="${host}" username="${user}" password="${pass}" usepty="true"/>  
     <!--rename war file to ROOT.war and move to webapps dir-->  
     <sshexec command="mv ${remoteTempDir}/${originalWarName} ${tomcat}/webapps/ROOT.war"  
          host="${host}" username="${user}" password="${pass}"/>  
     <!--delete old exploded version in webapps dir-->  
     <sshexec command="rm -Rf ${tomcat}/webapps/ROOT"  
          host="${host}" username="${user}" password="${pass}"/>  
     <echo message="making a small pause.."/>  
     <sleep seconds="5"/>  
     <!--start up tomcat -->  
     <sshexec command="sudo /sbin/service tomcat start"  
          host="${host}" username="${user}" password="${pass}" usepty="true"/>  
   </target>  
 </project>  

(the scp and sshexec tasks require jsch-0.1.49.jar (the latest version as of Nov 2012) in the apache-ant-1.8.4\lib directory - download from http://www.jcraft.com/jsch/)

Run the script. It should deploy your test web app, and you should be able to see it by using either your unix host name (if you entered it in your local hosts file) or its IP on port 8080.

DNS Records Change


Now go to your domain DNS records (on your domain registrar’s website):
  1. copy the current value of the CNAME for www1 to a new dummy subdomain www9 (just to have the URL handy when you need to flip back to AWS Beanstalk)
  2. change www1 from a CNAME to an A record and specify your public IP address (google 'what is my ip address') - make sure you specify a low enough TTL to force the change asap when you need to flip it back.

Configuring Router Ports


Go to your router configuration and enable port redirection for ports 80 and 443:
  • router port 80 should forward to your unix box IP on port 8080
  • router port 443 should forward to your unix box IP on port 8443

Enabling SSL


I am assuming you already have a running Beanstalk environment with SSL, which means you already have your private key and have issued your certificate in apache format.
  1. Create a /company/ssl directory and copy the following files to it (I am using Namecheap.com files as an example; the first two files are provided by your SSL issuer to establish the certificate chain, the third file is your private key, and the last one is the Apache certificate you issued and used in beanstalk):
    1. AddTrustExternalCARoot.crt
    2. PositiveSSLCA2.crt
    3. private-key.pem
    4. www1_mysite_com.crt
  2. in that dir run:
     cat www1_mysite_com.crt AddTrustExternalCARoot.crt PositiveSSLCA2.crt > chain.crt  
     openssl pkcs12 -export -in www1_mysite_com.crt -inkey private-key.pem -out www1_mysite_com.p12 -name tomcat -CAfile chain.crt -caname root -chain  
    
    when asked for a password, type your desired password ('passw-changeme')
  3. add the following connector in conf/server.xml in the tomcat config:
     <Connector  
          protocol="HTTP/1.1"  
          port="8443" maxThreads="200"  
          scheme="https" secure="true" SSLEnabled="true"  
          keystoreFile="/company/ssl/www1_mysite_com.p12" keystorePass="passw-changeme"  
          keystoreType="pkcs12"  
          clientAuth="false" sslProtocol="TLS"  
     />  
    
    also change all host names in that file from 'localhost' to 'www1.mysite.com' (use the appropriate host for which you issued your SSL certificate)
Now restart tomcat and hit your https://www1.mysite.com url (change mysite to your real domain) - you should be able to see the dummy web app's page working and verify the SSL certificate.

Deploying the real app


Most likely, your real web app has expectations about certain paths (for example, my beanstalk app writes its log files to /opt/tomcat7/logs - the tomcat installation directory on the beanstalk AMI). The easiest way is to create the same directories or symlinks on your local unix box, with permissions given to the 'developer' unix user, to mimic the directories your app expects to work with on Beanstalk. This way you can use the same war file for both deployments.

Now modify the ant script to use your real war file and give it a spin!

Reverting back to AWS Beanstalk

  1. shut down your local unix tomcat environment:
     service tomcat stop  
    
  2. Go to your domain DNS records (on your domain registrar's website) and make the www1 subdomain a CNAME pointing to your AWS Beanstalk URL.

Conclusion


Yes, I know it is really not cool to run a prod app from your home. Your IP might change, the lights might go out, your spouse might be downloading something big, etc. Yet, if it takes only 1 minute to flip back to standard AWS Beanstalk hosting (simple enough to do from the phone if your internet is down), the downside is not that horrible, and the cost savings are real.

Thank you for reading, feel free to ask any questions, and if you have any tips for making it easier please share them!

Thursday, October 4, 2012

Minimum Viable Product in 10 Evenings - InstanceVibe.com



On Sep 12, 2012 Amazon announced the AWS Reserved Instance Marketplace. Since I use Reserved Instances (RI) for one of my projects (BrainLeg - a structural java exception search engine), I became immediately interested.

Unfortunately, in its current shape the RI Marketplace only provides a listing of current offerings plus a Buy Now button. For most people it would be much more convenient to create an alert for some instance criteria (region, availability zones, instance types, products, minimum quantities, minimum and/or maximum term, utilization) and be notified when such an instance gets listed on the marketplace.

I decided I would dedicate 10 evenings to design, develop, and launch a minimum viable product. This is how it went.

Evening 1 - Running around the apartment trying to put my thoughts together


Evening 2 - Feature Design


Feature 1 - ability to set alerts to be notified about new listings. Some questions I went through:
  • how will they be notified? Email for now
  • how often will email be sent, daily summary or as it happens? For now as it happens
  • how often can I scan marketplace? TBD tomorrow
  • what should be in the notification? Delta from the previous notification. Okay, delta, but what should be included? Fixed Price, Hourly Price, and pre-calculated Total Cost of Ownership (TCO) for each term (3m, 6m, 9m, 12m, 18m, 24m, 30m, 36m)
  • is it a paid feature? For the cheapest instance type (t1.micro) it should be free so that users can safely try it out. For all other instances it should be a paid feature. Ok, what about price levels and duration? For now keep it simple: the same price for all unlimited alerts, differentiated by duration - 2 weeks or 4 weeks.

Feature 2 - ability to see analytics for RI marketplace. I would like to buy an instance for 8 months. When I see a listing with 9 months remaining, how do I know whether to buy or not?
  • what is the liquidity of the marketplace? How often do instances of this type in this region with this OS get listed, and how often do they get sold? I will need a historical chart of instances listed/sold for each combination of (instance type, region, platform).
  • what were historical prices for sold instances? I need a historical chart breaking down for the past N days for each typical term (3m, 6m...36m) the best/worst TCO I would have achieved if I bought on that day.
  • Paid? no, free for all

Evening 3 - Viability Test


The AWS SDK does not provide a hook into the marketplace at the moment. How can I get the current listings? I will need to emulate a user with a browser, logging in and then clicking the 'Purchase Reserved Instances' button.

I used FireBug to trace which AJAX calls were being made by the AWS Console's JavaScript - it was getting a nice JSON object with the current listings. The next step was to quickly hack something together to check whether I could emulate the end user. I used HtmlUnit to log in, then collected all the cookies, and then executed the actual POST request with those cookies using HttpClient to get the JSON. It worked. I then knew I could get the data I needed.
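For the curious, the hack looked roughly like the sketch below - heavily simplified and hedged: the sign-in form handling is omitted, the AJAX endpoint is a placeholder, and it assumes HtmlUnit plus Apache HttpClient 4.x:

 // Rough sketch only - the real console login flow is more involved, and the endpoint and its
 // parameters are whatever FireBug showed, not anything documented.
 import org.apache.http.client.methods.HttpPost;
 import org.apache.http.impl.client.DefaultHttpClient;
 import org.apache.http.util.EntityUtils;
 import com.gargoylesoftware.htmlunit.WebClient;
 import com.gargoylesoftware.htmlunit.util.Cookie;

 public class MarketplaceScraper {
     public static void main(String[] args) throws Exception {
         // 1. Drive the AWS console sign-in with HtmlUnit (form filling and submission omitted here).
         WebClient browser = new WebClient();
         browser.getPage("https://console.aws.amazon.com/ec2/");
         // ... locate the sign-in form, fill in the credentials, submit ...

         // 2. Harvest the session cookies HtmlUnit collected along the way.
         StringBuilder cookieHeader = new StringBuilder();
         for (Cookie c : browser.getCookieManager().getCookies()) {
             cookieHeader.append(c.getName()).append('=').append(c.getValue()).append("; ");
         }

         // 3. Replay the same AJAX POST the console's JavaScript makes, with those cookies attached.
         HttpPost post = new HttpPost("PASTE-THE-AJAX-URL-SEEN-IN-FIREBUG-HERE");
         post.setHeader("Cookie", cookieHeader.toString());
         String json = EntityUtils.toString(new DefaultHttpClient().execute(post).getEntity());
         System.out.println(json);
     }
 }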

Evening 4 - Architecture Design


Which web framework should I use to implement my product? Which database? There is a saying among photographers: 'the best camera is the one that's with you'. I know Grails quite well and I love it. Plus I already wrote the Grails GORM adapter for AWS SimpleDB. Why SimpleDB? The hosting costs of running a dedicated AWS RDS instance would be too high. I need something cheap, and SimpleDB is perfect for it. Later I can move to DynamoDB if needed.

Ok, grails. Should everything be dynamically served by grails? No, that might cause too much CPU load and I would need more AWS instances; thus all analytics and the home page should be pre-generated by the grails server (using the quartz plugin) as static html files and images, uploaded to S3, and then fronted by CloudFront for caching optimizations.

What about the CSS? Bootstrap rocks.

What about the logo and favicon? I posted a job on 48hourslogo.com and for $100 got a very nice logo.

What about hosting? I like namecheap.com. Got a 5-year domain with a 3-year SSL certificate plus privacy for $83.

What about handling credit cards? Stripe is an absolute joy to work with; it took me 2 hours from registering a Stripe account to having a dummy page processing test credit cards.
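As a hedged illustration of how little server-side code that took (stripe-java, test mode; the key, token, amount, and description below are all placeholder values - the card token itself is created in the browser by Stripe.js, so card numbers never touch my server):

 // Minimal server-side charge sketch with stripe-java (test mode, hypothetical values).
 import java.util.HashMap;
 import java.util.Map;
 import com.stripe.Stripe;
 import com.stripe.model.Charge;

 public class ChargeExample {
     public static void main(String[] args) throws Exception {
         Stripe.apiKey = "sk_test_YOUR_TEST_KEY";        // secret test key from the Stripe dashboard

         Map<String, Object> params = new HashMap<String, Object>();
         params.put("amount", 1500);                     // amount in cents
         params.put("currency", "usd");
         params.put("card", "tok_FROM_STRIPE_JS");       // one-time token created client-side by Stripe.js
         params.put("description", "InstanceVibe alert");

         Charge charge = Charge.create(params);
         System.out.println("Charge id: " + charge.getId());
     }
 }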

Evening 5,6,7,8 - Coding


Data Model:
  • Offering - represents an offering in the RI Marketplace
  • Scan - represents a scan of RI Marketplace. A Scan creates offerings if they do not yet exist
  • Stat - stores the statistics for a given timestamp (number of instances sold/listed, min/max TCO prices). They are used to build charts.
  • User - end user, linked to spring security plugin
  • Alert - alert configuration for a given user (region/instance type etc)
  • Tx - record of financial transaction for charging/refunding users and for accounting

As usual, the back-end model and the analytics engine took about 35% of the time, and the rest was spent mostly on UI and testing.

Evening 9 - Deployment


I deployed on AWS Beanstalk on a c1.medium instance. It provides only 1.7 GB of memory but runs very fast. For now I launched it as an on-demand instance; I will create an alert and find the RI of my dreams using my own product!

Evening 10 - PROD Testing


Created a c1.medium alert on production, waiting for my perfect instance. You can check the latest analytics for c1.medium.

Conclusion


By working within severe time constraints I had to really focus on making things work well and … on making as few things as possible. At the end of this experiment I was completely exhausted, and it took me 2 days to recover. However, it was absolutely the best project ever in terms of fun and personal satisfaction.

The end result: www.instancevibe.com

None of this would have been possible without open source. Thank you Grails team, all grails plugin developers, JFreeChart, and the Bootstrap team.

If you have any questions about particulars of architecture, deployment etc. I would be happy to answer. Ideas and suggestions are really appreciated as well, thank you for reading!

Wednesday, August 24, 2011

Your last two weeks at a job - some do’s and don’ts


This post is for those who are leaving their current company on good terms (see my previous post about involuntary separation).

So you’ve given your two weeks notice, everyone in the office knows that you are leaving, and you feel ready, perhaps even giddy to move on. In a sense, you’ve already said your goodbye to the company, even if only in your own mind. Now is the time to take it easy, right?

What not to do


There is really only one item in this category. Don’t ruin your reputation.

I see this all the time. The first thing that happens is the soon-to-be ex-employee starts to come to the office late, and then to leave at 5 pm on the dot. Then it becomes quite obvious that she/he couldn't care less about the team's next release, or about much else that needs to get done before their last day with the company actually arrives. The person simply stops caring; they are no longer emotionally or otherwise invested.

Is this natural? Of course it is! After all, you’ve worked hard for the company, probably haven’t gotten all you deserved in return, and what could they do anyway - fire you?

The problem with this behavior is that in the eyes of your soon-to-be former co-workers and your soon-to-be ex-boss you effectively negate all of your previous achievements, and as a result you leave everyone with what memories of you? Exactly - lazy, unenthusiastic, self-centered.

Any industry is a fairly small world - you might meet these people again, and sooner than you think. Don’t let them remember you as that jerk who left at 5pm when everyone worked till 8:30pm, took three hour lunches, spilled all the gossip, and stole the good pens. Don’t leave a bad last impression.

What to do during your last two weeks


The best way to spend your last two weeks is making sure everyone who ends up picking up your work and responsibilities knows what to do and is up to date on what you've done and why. In other words, instead of trying to implement a cool new feature, you should spend your time cleaning up loose ends and meticulously documenting your work.

Document everything you think would be helpful to you if you had to fill in for someone who has left. Don't write things like 'install PGP and create keyrings and pairs of keys' - this is not really helpful; instead create a step-by-step guide with screenshots. It takes 15 minutes of extra effort, but people will thank you for it long after you have left the company.

Another important tip is to create a sense of continuity for the people you’ve been working with and for your projects. The best course of action is to redirect new problems and requests to the person who is going to be taking over your responsibilities while you spend your remaining time documenting and cleaning up. This allows your replacement to get a real feel for the job while you are still there to answer their questions - they will be much better prepared to take over for you. However, if they do run into issues, don’t solve the problem yourself - this will not help them deal with things when you are gone. Instead explain the issue and sit with them while they are fixing it.

Very good advice from Brian Henerey (http://www.opsly.me/2011/08/knowledge-transfer-devops-style.html)  is to train two people when you are leaving. This way the team eliminates a single point of failure in case the person who is replacing you decides to resign in two months.

Your reputation is still your biggest asset. Your last two weeks is the time in which you can either destroy or strengthen it. Do the right thing.

Friday, July 8, 2011

Never use the company’s name in your naming convention


I have made a non-negotiable commitment to myself - never again will I put a company’s name (let’s say it’s company XYZ) in any of the items below:
  • java package name (com.xyz....)
  • class or script name (XYZStoredProcedure.java)
  • database table name (XYZ_DOCUMENTS)
  • host name (prod1.xyz-internal.com)
The reason is very simple - things change as the companies or their divisions get sold/acquired/merged into other companies, and nothing looks funnier than explaining to a new team member ‘well, all our packages start with ‘com.xyz’ for historical reasons because XYZ got sold to our new company ABC and it would take too much effort to refactor and then re-test everything’.

What naming convention should you use? Use something generic like ‘App’ (com.app... ,  AppStoredProcedure etc.) and/or something neutral not related to a specific project like ‘Carbon’ or ‘Star’ (so you can have com.app.carbon... or com.carbon...).

Do not worry about java package clash.
In my entire professional experience I have yet to find a third-party library which used the same naming in its package name as I use in mine - most open source projects are well-named to prevent package clashes with your code.

So when you start from scratch at your new company, do yourself a favor and don't use the company name anywhere, no matter how confident you are that 'this time it will be permanent'. You will thank yourself for that decision in a couple of years.

Tuesday, June 28, 2011

An Articulate Idiot – The Force Behind Many Failures


“The really dangerous people believe that they are doing whatever they are doing solely and only because it is without question the right thing to do. And this is what makes them dangerous” (Neil Gaiman, “American Gods”)

First, a definition. An Articulate Idiot is not an idiot in an intellectual sense at all: he/she is a fairly smart person with unshakable convictions who is wrong in a fundamental way but frequently uses his/her outstanding debate skills to beat anyone who raises questions or challenges him/her into submission.

Articulate idiots are present in most organizations and on all levels. The damage they cause is directly proportional to where they stand on the corporate ladder.

On the most benign level, articulate idiots are responsible for horrible user interface designs. Think about some irritating user interface feature in some product - my pet peeve used to be the inability to get rid of the default grouping by conversation in Gmail's inbox (it took Google 4 years to allow people to change that option), the iPhone's lack of a standard 'menu' hardware button causing screen real estate waste, Microsoft's idea of Outlook's configuration menu consisting of a nightmarish 3-level nesting of modal pop-up dialogs (have they ever tried adding a new mailbox?), etc.

The reason for all these awkward user interface bobbles is always the same – there is an articulate idiot in each of the above mentioned organizations who either has direct control (head of the UI team, for example) or who passionately kicks the #@#$ out from everyone suggesting some changes in front of the management. The person will come up with 1,000 reasons for why things should be what he/she wants them to be. In the end everyone around him/her behaves as in an old Russian proverb: 'sometimes it’s simpler to sleep with someone than to explain that you are not in the mood'.

Yet the real problems start when there is an articulate idiot higher up in the organization. I would venture to claim that most failed M&As (mergers and acquisitions) and most fatal strategic errors are caused by an articulate idiot creating a rosy picture and very logically defending it against critique - they manage to bring up just the right statistics and reports and white papers, they cite the leading experts, they point to numerous failed examples of those who tried to do the 'wrong' thing, they refer to their own successful implementation at the previous place, and the most horrible thing - they are masters of powerpoint presentations where, in six to eight irresistible bullet points, they make a solid case proving that theirs is the only correct solution to the problem.

You might argue that true leaders and visionaries also have strong convictions, and point to examples where a charismatic CEO changed the strategic direction of the company and succeeded when no one agreed with him.

The difference between articulate idiots and leaders/visionaries is very subtle and boils down to one thing - both leaders and visionaries have the courage to consider the possibility of not being right or to change some of their assumptions as time goes by. Articulate idiots are never wrong in their own eyes, and all their assumptions are always true.

So how can an organization protect itself from articulate idiots? Unfortunately it is not something that can be dealt with by coworkers - it's the responsibility of management on all tiers to differentiate between two simple things: 'smartness & articulation' vs. 'common sense'. It takes skill and experience for a manager to recognize the articulate idiot syndrome in a person and to evaluate that person's ideas based on common sense first.

Sunday, May 29, 2011

How To Properly Terminate Employees


This post is about a topic which is decidedly unpleasant – the termination of your employees.

I will leave the 'why' out; all companies hire people and all companies fire people. In the end, you as the head of IT have to handle the termination and make sure it is done properly.

First, let's differentiate two important cases of termination:
  • voluntary termination
  • involuntary termination

Voluntary Termination


Generally this occurs when an employee resigns. After the resignation conversation with the manager takes place, there are usually two possible outcomes: an amicable separation or an ugly one.

In an amicable separation the soon to be former employee will offer his employer a two or three week grace period to help with the transition. This means that the person remains a real team member who retains full access while actively helping the rest of the team to gradually take over his responsibilities. In an amicable separation the termination tasks are left till the last day of employment. You close down all of his or her accounts and access points on all systems, and then you take that person out for drinks.

In an ugly separation (i.e. one day the employee throws the building security card on the manager's desk and proceeds to tell everyone what he or she thinks of them) see "Involuntary termination" because the actions you will have to take are exactly the same, and because you will have to perform them immediately.

Involuntary Termination


Involuntary termination is tough on everybody. Unless the person you are about to fire is an asshole extraordinaire, you and your colleagues will naturally feel a lot of sympathy for him. Knowing a week or two (or sometimes more) ahead of time that the person will be terminated makes it even worse. You may experience feelings of guilt and even shame for having to let him go while keeping him unsuspecting and unprepared. However, do not let the employee know. If you do, you run the risk of undermining the rest of the team and your professional reputation. And I'll tell you why.

Let's go step by step through the involuntary termination process gone wrong:

  1. The manager calls the employee into his office.
  2. The manager (and sometimes a company lawyer) spends 15-90 minutes with the person who is being fired.
  3. The person comes out from the office, logs into his computer (or some other computer on the network) and does something stupid such as:
    • posts customer list/source code/passwords on the internet
    • formats his hard drive
    • sends out some sort of email to all the customers
    • changes critical passwords in production systems
    • commits an 'easter egg' piece of code into source code repository

The important thing to note is that the person might not really mean to do something malicious. For example, I knew a person who after being fired went to his computer and cleared out his 'My Documents' folder thinking that it would simplify life for IT personnel. Unfortunately the person didn't realize that he deleted some critical documents which he had saved locally (this is where daily backups come to the rescue but that's a topic for another post).

What it boils down to is a very simple rule: in the case of involuntary termination, the moment the person enters the manager's office you have to immediately terminate his/her access to all systems, including physical access to desktops or laptops. Once you disable his/her Active Directory account, restart his/her computer and either physically take it away or don't let the person touch it.

It sounds very harsh and impolite, it makes everyone feel bad, but trust me, by eliminating all access points immediately, you save the person from possibly making the biggest mistake in his/her career at a moment when they may feel most vulnerable, humiliated and hurt. You also save your company from potentially major problems.

Some practical tips


Have a well-defined list of termination tasks that should be done for each type of employee, and split the tasks among the appropriate people. If you have some type of issue tracking system, enter the tasks as a template there once and then just 'clone' these tasks and assign them to people. If you do not have an automated system (you should), print out the tasks and make people give you a signed copy back when they are done.

All employees:
  • disable user accounts in Active Directory/unix servers
  • disable VPN and email access
  • disable access to all shared drives (none should be allowing 'guest' users)
  • disable access to all external systems such as WebEx, Salesforce etc
  • disable access to all production and development environments (typically web-based accounts)
  • disable building and office entry access

IT employees:
  • disable access to all production (database, web, monitoring systems) and development systems
  • disable access to all source code repositories
  • change passwords on all critical infrastructure components (routers, switches, ILO, root passwords etc)
  • disable access to server rooms



I know this is a painful topic, please share your experiences and tips!

Sunday, April 17, 2011

Encryption of passwords for personal and work use


My previous post on the topic of privacy and security was of a more theoretical nature. This post is about a topic applicable to all of us - how to keep, protect, and manage your personal and work passwords.

First, a comment on a notion that I often hear repeated - that there is no need to actually store your passwords, because if you are smart enough and use some kind of pattern to create passwords then the only thing you need to remember is that pattern. For example, let's say you add a prefix and/or suffix to the company name (which is what many people do) - so for www.amazon.com you use 'my Amazon Password' and for www.google.com you use 'my Google Password'. This is not a good idea, and I'll tell you why.

Unfortunately many websites store passwords as plain text in their database (instead of computing a one-way hash of the password and storing only that hash). So if someone breaks into one of these sites, where, let's say, you registered for a one-time purchase 2 years ago, they can easily figure out your passwords for other websites.
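For contrast, this is roughly what 'storing only the hash' means (a minimal sketch; a real site should also add a per-user salt and use a deliberately slow hash such as bcrypt, which is beyond the scope of this post):

 // Minimal one-way hash sketch: the site stores only the hex digest, never the password itself,
 // so a leaked database does not directly hand out the original password.
 import java.security.MessageDigest;

 public class HashDemo {
     public static void main(String[] args) throws Exception {
         String password = "my Amazon Password";

         MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
         byte[] digest = sha256.digest(password.getBytes("UTF-8"));

         StringBuilder hex = new StringBuilder();
         for (byte b : digest) {
             hex.append(String.format("%02x", b));
         }
         System.out.println(hex);   // this is what gets stored instead of the plain-text password
     }
 }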

Unless you have a great memory you do need a secure place to store your passwords. Some people write them down on scraps of paper which they then lose, some keep notebooks, some use encrypted zip files. This is all fine (although I wouldn’t go with option #1) if you have only one computer or use your passwords from only one location.

Similar problems arise at work - in each IT organization there should be more than one person who has full access to all the root passwords to the servers. For good personnel redundancy it’s worthwhile to share them among at least two system administrators and their manager.

For both home and work I use a free open-source tool called KeePass. It works very well - I can generate complex passwords, copy the passwords to clipboard, quickly find all relevant entries by typing some part of the entry name etc.

One word of caution about synchronization of the KeePass database (which is a single small file). The built-in synchronization only works for a shared file. If you have more than one computer from which you need to access/create passwords then all of these computers must have access to some centrally shared file on one of your servers to read and update passwords. The program then manages updates from different computers by merging changes so that they don’t overwrite each other.

For both work and home use I prefer another kind of solution, which is achievable with a free KeePass plug-in called KeePassSync. This plug-in allows you to work with your local password database and keep it synchronized with a securely stored central password database located on a unix server (via PSCP) or on amazon S3. I.e. you can have the password database on your laptop and on each relevant computer and use it without being on your work network or even on the Internet. For work I use a local unix server, and at home I use S3 to store the central copy of the database.

KeePass also allows you to attach files to database entries, so it's perfect for storing private SSH keys by attaching them to the entries for the appropriate servers. This way the log-in and the key file are stored in the same place, securely. You can attach text notes as well - very handy for storing all the challenge questions for your financial institution accounts.

One word of caution - always back up your password database file to an off-disk location regularly, e.g. burn it to a CD every couple of months.

Saturday, April 9, 2011

Gut Feeling - Project Planning and Management Part I


If I were to summarize my opinion about project planning into one sentence it would be this - trust your gut feeling no matter what all the traditional project planning techniques tell you.

The mistakes in project planning mostly happen because of:
  • Failure to understand the complexity of the business logic
  • Failure to understand the perimeter increase
  • Failure to contain ‘digital growth’
Let’s look at each of them in turn.

Mistake 1. Failure to understand business logic complexity


Imagine that you have to implement a page with three input fields and a save button. How long would it take to develop? The answer is - it is a fixed cost for the database-related work but a variable cost depending on what the logic around these fields is. And it is the variable cost which kills most of the initial estimates.

Case 1: You are implementing a user details page for a dating application and the user enters his First Name, Middle Name, and Last Name. So the page looks like this:



Database-related work is mostly trivial, and the business logic here would just be ensuring that the user does not enter very long names (no longer than 100 characters) and that the First name is mandatory (i.e. at least 2 characters). So there is some business logic focused mostly on validation but it is not that much work.
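In code, the whole of Case 1 boils down to something like this (a Python sketch; the function and field names are my own):

def validate_user_details(first_name, middle_name, last_name):
    errors = []
    if len(first_name.strip()) < 2:
        errors.append("First Name is mandatory (at least 2 characters)")
    for label, value in (("First Name", first_name),
                         ("Middle Name", middle_name),
                         ("Last Name", last_name)):
        if len(value) > 100:
            errors.append(f"{label} must be no longer than 100 characters")
    return errors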

Case 2: You are implementing a portfolio allocation page. That page shows your current portfolio value (let’s say $1,000,000) and allows the user to specify how this should be split between stocks, bonds, and cash. Once again the screen looks very similar to the first case, i.e. it consists of three entry fields and a Submit button:



However, the business logic is more complex here because you want to give the user the ability to specify either percentages or actual amounts, or both. So your rules become:
  • if all values are percentages, i.e. contain the symbol ‘%’, then you have to make sure that each value is a valid positive number and that all of them sum up to 100.
  • if all values are actual amounts, you have to ensure that all are valid positive values and sum up to the portfolio value of $1,000,000.
  • if the user mixes percentages and actual values, the logic of the system should be: subtract the actual values from the portfolio value ($1,000,000) and distribute the remaining amount by the specified percentages. Which means if the user enters two percentage fields, they must add up to 100 (we distribute the remainder between these two fields), and if he enters only one percentage field he had better enter 100%.

So suddenly despite the same effort on the database, there is a bigger effort on the business logic.
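For comparison, here is one possible Python sketch of the Case 2 rules. The parsing, tolerances, and field names are my assumptions; a real page would also need proper decimal handling and friendlier error messages:

def validate_allocation(fields, portfolio_value=1_000_000):
    # fields is a dict like {"Stocks": "50%", "Bonds": "300000", "Cash": "50%"}
    percents, amounts, errors = {}, {}, []

    for name, raw in fields.items():
        text = raw.strip().replace(",", "")
        is_percent = text.endswith("%")
        if is_percent:
            text = text[:-1]
        try:
            value = float(text)
        except ValueError:
            errors.append(f"{name}: '{raw}' is not a number")
            continue
        if value <= 0:
            errors.append(f"{name}: value must be positive")
            continue
        (percents if is_percent else amounts)[name] = value

    if errors:
        return {}, errors

    if amounts and not percents:
        # All actual amounts: they must add up to the portfolio value.
        if abs(sum(amounts.values()) - portfolio_value) > 0.01:
            errors.append("Amounts must sum to the portfolio value")
        return amounts, errors

    # Percentages (alone or mixed with amounts) must cover 100% of what is left.
    if abs(sum(percents.values()) - 100) > 0.01:
        errors.append("Percentages must sum to 100")
        return {}, errors

    remainder = portfolio_value - sum(amounts.values())
    if remainder < 0:
        errors.append("Amounts exceed the portfolio value")
        return {}, errors

    allocations = dict(amounts)
    for name, pct in percents.items():
        allocations[name] = remainder * pct / 100
    return allocations, errors

Even this sketch, which still ignores rounding, currency formatting, and the wording of the error messages, is several times the size of Case 1 - and that is the hidden variable cost.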

Case 2 of course is not a very good user interface design in the first place, but I made it up to illustrate the point that looking at ‘what’ you have to show or save (‘just 3 fields’) leaves out a much riskier portion - ‘how do you handle’ that which you are planning to show or save. The difference between Case 1 and Case 2 might not be that frightening at first, but imagine you are talking about 30 pages with 10-20 fields each, and you are suddenly talking about an extra 2 weeks for your team which you did not count on.

Mistake 2. Failure to understand the perimeter increase


This is a different kind of failure. Typically there is one version of the project after another, and each version contains a set of improvements. The problem is that with each new version the size (perimeter) of the project grows and that means two things:
  • each new change is more likely to require some re-work on the previously existing parts of the system. For example, imagine that the initial version of our application was based on Case 1 (three user-entered fields) but the new version of the application adds a fourth field called ‘City I live in’, and now you have to update all the places in the system to show the fourth field as well.
  • your ‘liability luggage’ is increasing - there are more places where things can break after you roll out yet another version in 4 months. Imagine you are under a siege and have 10 warriors to defend a four-bedroom house (very doable) and then start adding 2 rooms every month. When you have 20 rooms suddenly it becomes very tough. And no, you can’t hire another 10 warriors :)

Mistake 3. Failure to contain ‘digital growth’


This is my favorite kind of failure in software development project management. First, what is ‘digital growth’? This is a software equivalent of ‘marine growth’.

Ships become slower when their hulls accumulate barnacles, shellfish, etc. In software, the accumulation of things that ‘should have been done for this release, but we are time-constrained and we will do it in the next release (but when we start coding the next release we are too busy with other things)’ increases the long-term fragility of the system.

‘Digital growth’ is barely noticeable in the first 3-5 releases of the system, but it becomes an ever bigger source of project risk if not contained in a timely manner. The only way to contain ‘digital growth’ is to allocate time after each release to clean up the mess before starting on a new big feature.

So how can you estimate what it would take to implement your next version of the software?


What does not work: trying to list out all the small and big to-do items and features, summing them up, distributing them in some sequence across developers, and then adding some time for testing. Why? Because estimating items one by one always excludes all three of the project failure reasons listed above, and you always end up with a totally unrealistic, optimistic estimate which your team will suffer for.

What works: over the years I found only one management tool which is reasonably accurate at predicting development and testing effort - your gut feeling. This ‘gut feeling technique’ boils down to one thing - study the specification, review the prototype, then think about what you ‘feel’ the implementation and testing will take, multiply that by 1.3, and you have your estimate.

So if your gut feeling tells you it will be 2 months for 4 people, you can be quite sure it will really be approximately 2.5 months.

For your next project try doing estimates both ways, write down both numbers, and then, after you have launched in production, go back to your piece of paper and check: your gut feeling was more accurate, because it intuitively accounts for the three dangerous sources of project risk, and line-by-line estimates with Gantt charts do not.

This project planning ‘gut feeling’ comes only from direct experience with software development. If you are a software developer most likely your gut feeling is well tuned. If you are a business analyst or a project manager a lot depends on how closely you have worked with development teams before.

Similarly, the further you are from software development the less you should trust your gut feeling when it comes to estimates. Instead you’d better ask a team lead for his gut feeling estimate and trust that.

If you think your gut feeling is not yet well tuned - don’t worry: it is very easy to develop. Pick an intense 3-month project, go through all the pain of delivery and the ‘why do we still have 73 critical bugs left’ meetings, and you will soon be all set with an uber project scheduling methodology which is more accurate than anything you have ever heard of before.


Sunday, April 3, 2011

Dell’s idea of warranty for failed HDD


I just learned an interesting business idea: when your customer’s HDD fails, send a refurbished HDD to the customer and make him return the failed one to you. Once you get back the failed HDD, do a low-level format and send it as a replacement for another customer’s failed HDD.

This is apparently what Dell is doing with warranty HDD replacements - I just got a replacement HDD which has a nice ‘Refurbished’ logo on it. Tech support told me they always send refurbished hard drives and that I cannot get a new one.








Saturday, April 2, 2011

Privacy - Are you still worried about your browser cookies?


Foreword: I am not a conspiracy theory enthusiast. I'd say that on a scale from 0 (don't care about privacy at all) to 10 (“big brother is watching our every step” paranoid) I'd have to be somewhere around 4 to 5 – an average person who is mindful of privacy.

That being said, have you ever wondered if your Net Behavioral DNA (BDNA) – the way you use the internet – can be used to uniquely identify and track you?

Think about it. How consistent are you in your own net use? The searches you make; which results you click on; the sites you visit; when you visit; what kinds of links on these sites you click; at what intervals – all of these statistics combine to form a unique combination – BDNA – which can identify you no matter where you access the internet from. No need to attach any tracking device to you any more. BDNA is with us all the time.

Is anyone tracking people by BDNA? I hope not. Is it possible? Yes. The accuracy of tracking will be proportionate to the amount of available data, and most of us generate enough of it. Wherever you go, if you use the Internet there is an ISP at the end of your internet cable which stores all the stats, and the only thing needed to identify you is access to these internet usage stats (and an algorithm which can probably be written in a couple of weeks and reasonably trained as an artificial neural network on a week's worth of stats from about 50 people). I am not going to go into the technical details for cases when one IP address is used by more than one person (a family or a company office). It is still doable. BDNAs are so unique that you could put 100 people together behind one IP and their BDNAs would still look like a zoo with 100 different animals sitting in the same cage.

How do you protect your BDNA? There is only one way (if you don't count not using the internet) – create noise. Yet this is a science of its own, because if you just start clicking on random links that will still be a pattern. Perhaps one day there will be tools to protect your BDNA on the net by carefully 'diluting' your browsing habits.

Are you still worried about your browser cookies?