The Smile-IT Blog » Blog Archives


Fruits aren’t necessarily healthy

I converted. Religiously – so to say.

In the ’90s I was into Sinix (does anyone still know that one?); essentially it was Unix anyway – no worries. In 1993 I commenced on Windows. V3.1, I think. C. C++. Basic. Visual Basic – stuff like that. Later MFC, STL, ATL and the like. Always Windows; nearly solely. For more than 20 years. Attempts to religiously brainwash me away from Unix, however, failed more or less; though I developed a really strong and happy relationship with all things Microsoft. Heartily. And not by religion.

I never really had a problem discussing other personal computing options – especially the fruity ones. The only thing was that the “discussees” in these conversations always tended to claim the predominant genius of their fruit – which always left me a bit suspicious.

And then – finally – at the end of August this year, curiosity beat suspicion. And I bought an Apple – not for eating (I’m more the meaty guy ;))

So, here I am – one month later! With reality proving the claims – or not. Here are my 4 most awkward working experiences with a MacBook Pro after the first few weeks of usage:

1. Keyboard

The Mac comes with 4(!) different keyboard overlays; i.e. one key can theoretically have 4 different effects (keys, shortcuts, functions – you name it). Fine. There are no Home (Pos1)/End keys. Still fine (though there would be enough room for adding those left and right of the cursor-up key). Anyway – things become blurry when you try to learn the shortcuts for

  • Start of line, end of line
  • Start of document, end of document
  • Back/forward one word, paragraph, …
  • All that including “selecting text”

Plus: Try those in an editor, then in Mail, then in some of the Microsoft Office programs.

Disclaimer: I totally and wholeheartedly admit that arguing based on third-party apps developed for the Mac is inapplicable here! Hence, please don’t argue that it’d be Microsoft’s fault not to adhere to the MacBook’s logic for shortcuts!

However: What IS the logic? And IF there is any: Why is it so fck.gly complicated? Some friends told me prior to converting to the fruit religion that it just takes 2 weeks to accommodate. Sorry folks: I missed that timeline miserably. I just don’t get it. I am open to continued learning: If you can provide a logic for me in this respect, you are tremendously welcome to post your comment here!

Disclaimer 2: In contrast to Apple, those shortcuts are well established and generally accepted in Microsoft’s OS and app world. Pos1/End, selecting text, find, quit, close window, etc. – always the same shortcut. I haven’t stumbled upon a single program recently that does it differently.

2. Finder

Did I mention that I love Unix, Linux, …? I always did. I never really was an expert, but I loved the straightforwardness of that OS and its logic – even though some things just did not work (and some things in the past may have been utterly complicated to get to work). With regard to Unix’s logic, Finder presents itself perfectly in line. Directory structures remind you of how Unix always did it. The sidebar feels as if the important things have been mounted for you already. The user’s working directories are all there (is that structuring with “Documents”, “Music”, “Pictures”, etc. actually stolen from Microsoft, or the other way round?).

However, Finder starts failing its purpose when it comes to presenting the files it contains. I get 4 really smart views (icons, list, a convenient column view and the cover view). But how (the hell) is all this sorted? Alphabetically? Then there’s no way of getting directories to the top. By date? Same problem, made worse. Is there a way of setting a view preference for all Finder locations? No. Not without tweaking the guts of OS X. Is there a way of quickly resetting the view within one location? Well – after some searching I found the awkward CMD+ALT+CTRL+<number> shortcut. Weird. And – to me – a totally ill logic of dealing with files.

Disclaimer: I hear the argument, folks, that this is all a matter of getting used to it. Well, if it’s all about accommodation, who’s to claim advancement then?

3. iTunes

My NAS offers an iTunes server which turns the NAS’ MP3 library into an iTunes Home Sharing participant. Theoretically. However, iTunes never manages to discover the home server. iTunes, in fact, isn’t capable of dealing with my lovely music library by any means other than adding it to its own library (which obviously is a redundancy overkill AND a lock-in, by the way).

The annoying fact here is that even though everything is – or should be – Apple-made, it doesn’t collaborate properly. This isn’t particularly disastrous; it just doesn’t give me the feeling of advancement over any Windows machine.

4. Finally – the BSOD comparison

I had 3 crashes. Already. Within the first month of use. 3 crashes that were more or less as significant as a BSOD on Windows. Mind(!): None of those 3 crashes were related to any non-Apple apps. I do have regular crashes of the Microsoft Office suite – for whatever reason. Office-on-Mac doesn’t seem to be really stable (I need it anyway, so what can I do :)). At least re-starting it from an SSD is sufficiently fast.

Anyway – the 3 total crashes were as follows:

  1. Finder became unresponsive. So unresponsive that it could neither be restarted nor force-quit. Seemingly due to this, OS X refused to shut down, claiming that a program was hanging. Ultimately the only way of getting it to work again was the 4-sec-power-key option. Well known from my old Windows computer. So: no difference here (and I never found out what made it so unresponsive; this one has happened twice so far, btw)
  2. Network switching: It seems OS X is pretty weak on TCP/IP (wired LAN or WiFi – whichever). I have a NAS connected when on my private LAN (via SMB; AFP didn’t work for whatever reason). When leaving the private LAN without properly ejecting mounted drives, sometimes – unpredictably – the whole system hangs and remains as unresponsive as above. It may be that I am just too impatient to wait for it to respond again, but – well: I consider that a crash. Less disastrous ones happen again and again when switching between networks, hotspots, … (e.g. when travelling). I already got used to that. Obviously networking is the weak point of OS X.
  3. Printer driver: I added an HP LaserJet to the list of printers, allowed OS X to download the appropriate driver from the App Store, later disconnected from Ethernet and switched to WiFi-only mode because of a meeting and – boom – no more mouse/pad/keyboard interaction possible. Apps kept running. They even reacted to events. But I could by no means interact with them. Again: 4-sec-power-key force shutdown (little sidenote: the behaviour was reproducible until I deleted the printer completely).


And the learnings of all this – fortunately, for me:

  • Religion is a dangerous thing
  • Reality could prove religion wrong
  • Fruits aren’t necessarily healthy

Seriously spoken: the MacBook Pro on OS X Yosemite (recently upgraded to El Capitan) isn’t that much of an advancement over any properly set up and maintained Microsoft machine. And eventually I can now discuss the matter based on real-world experience. This is particularly disturbing, as one claim of the fruit guys always was, and is, that the homogeneity of hardware, OS and software would make exactly that the case. Well, it isn’t.

That’s by no means particularly bad. I’ve got a Windows tablet, an Android mobile and a Mac workhorse now. Where there’s software, there are errors. On any of the devices. That was and will remain true for all time. One just shouldn’t claim a tremendous advance just because of a brand – though, to be honest, there’s one thing that I do like about my new toy: it shuts down and boots so brilliantly fast that work interruptions due to whatever error don’t really hurt that much anymore – at least after the first 4 weeks.

Let’s see whether it remains like that.




Evaluation Report – Monitoring Comparison: newRelic vs. Ruxit

I’ve worked on cloud computing frameworks with a couple of companies by now. DevOps-like processes are always an issue in these cooperations – even more so when it comes to monitoring and how to approach the matter innovatively.

As an example, I keep emphasizing Netflix’s approach in these conversations: I very much like Netflix’s philosophy of how to deploy, operate and continuously change environments and services. Netflix’s different component teams have no clue about the activities of other component teams; their policy is that every team is responsible for its own changes not breaking anything in the overall system. Also, no one really knows in detail which servers, instances and services are up and running to serve requests. Servers and services are constantly and automatically re-instantiated, rebooted, added, removed, etc. Such is a philosophy that makes DevOps real.

Clearly, traditional (SLA-fulfilment oriented) methods must fail when monitoring such a landscape. It simply isn’t sufficient for a cloud-aware, continuous-delivery oriented monitoring system to just integrate traditional on-premise monitoring solutions like e.g. Nagios with e.g. AWS’ CloudWatch. We know that this works fine, but it does not ease the cumbersome work of NOCs or application operators to quickly identify

  1. the impact of a certain alert, hence its priority for ongoing operations and
  2. the root cause for a possible error

After discussing these facts for the umpteenth time and (again) being confronted with the same old arguments about the importance of ubiquitous information on every single event within a system (for the sake of proving SLA compliance), I thought I’d give it a try and dig deeper myself to find out whether these arguments are valid (and I am therefore wrong) or whether there is a possibility to substantially reduce event occurrence and let IT personnel follow up only the really important stuff. Efficiently.
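To make the “need to know” idea concrete: the goal is that only user-impacting events surface as alerts, with related infrastructure events folded underneath as root-cause candidates. A toy sketch of that filtering logic (all names, the topology map and the impact flag are my own simplifications for illustration, not any product’s actual model):

```python
# Minimal "need to know" event filter: surface only user-impacting
# events; attach events on upstream components as root-cause candidates.
from dataclasses import dataclass, field

@dataclass
class Event:
    source: str          # e.g. "web-app", "server-01"
    kind: str            # e.g. "slow-response", "cpu-saturation"
    user_impact: bool    # does an end user actually notice this?
    causes: list = field(default_factory=list)

def correlate(events, topology):
    """User-impacting events become alerts; non-impacting events on
    components the alerting source depends on become its causes."""
    alerts = [e for e in events if e.user_impact]
    noise = [e for e in events if not e.user_impact]
    for alert in alerts:
        deps = topology.get(alert.source, set())
        alert.causes = [n for n in noise if n.source in deps]
    return alerts

# Toy topology: the web app runs on server-01.
topology = {"web-app": {"server-01"}}
events = [
    Event("web-app", "slow-response", user_impact=True),
    Event("server-01", "cpu-saturation", user_impact=False),
    Event("server-02", "disk-warning", user_impact=False),  # stays quiet
]
alerts = correlate(events, topology)
# One alert instead of three raw events, with CPU saturation attached
# as the root-cause candidate.
```

Three raw events collapse into one actionable alert; everything without user impact stays out of the inbox.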

At this stage, it is time for a little

DISCLAIMER: I am not a monitoring or APM expert; neither am I a .NET programming expert. Both skill areas are fairly familiar to me, but in this case I intentionally approached the matter from a business perspective – as non-technically as possible.

The Preps

In autumn last year I had the chance to get a little insight into 2 pure-SaaS monitoring products: Ruxit and newRelic. Ruxit back then was – well – a baby: early beta, no real functionality, but a well-received glimpse of what the guys are aiming for. newRelic was already pretty strong and I very much liked their light and quick way of getting started.

As that project got stuck back then and I ended my evaluations in the middle of getting insight, I thought getting back to it could be a good starting point (especially as I wasn’t able to find any other monitoring product going the SaaS path that radically, i.e. not even thinking of offering an on-premise option; and as a cloud “aficionado” I was very keen on seeing a full-stack SaaS approach). So the product scope was set pretty straight.

The investigative scope, this time, was to answer questions in a somewhat more structured way:

  1. How easy is it to kick off monitoring within one system?
  2. How easy is it to combine multiple systems (on-premise and cloud) within one easy-to-digest overview?
  3. What’s alerted and why?
  4. What steps are needed in order to add APM to a system already monitored?
  5. How are events correlated and how is that correlation displayed?
  6. The “need to know” principle: impact versus alert appearance?

The setup I used was fairly simple (and reduced – as I didn’t want to burden our customer’s workloads in any of their datacenters): I had an old t1.micro instance still lurking around in my AWS account; that is 1 vCPU with 613MB RAM – far too small to really perform the stuff I wanted it to do. I intentionally decided to use that one for my tests. Later, the following was added to the overall setup:

  • An RDS SQL Server database (which I used for the application I wanted to add to the environment at a later stage)
  • IIS 6 (as available within the Server image that my EC2 instance is using)
  • .NET framework 4
  • Some .NET sample application (some “Contoso” app; deployed directly from within Visual Studio – no changes to the defaults)

Immediate Observations

2 things caught my eye only hours (if not minutes) after commencing my activities in newRelic and Ruxit, but let’s start with the basics first.

Setting up accounts is easy and straightforward in both systems. They both truly follow the cloud-affine “on-demand” characteristic. newRelic creates a free “Pro” trial account which is converted into a lifetime free account when not upgraded to “paid” after 14 days. Ruxit sets up a free account for their product but takes a totally different approach – closer to consumption-based pricing: you get 1000 hours of APM and 50k user visits for free.

Both systems follow pretty much the same path after an account has been created:

  • In the best case, access your account from within the system you want to monitor (or deploy the downloaded installer package – see below – to the target system manually)
  • Download the appropriate monitoring agent and run the installer. Done.

Both agents started to collect data immediately and the browser-based dashboards produced the first overview of my system within some minutes.

As a second step, I also installed the agents on my local client machine as I wanted to know how the dashboards display multiple systems – and here’s a bummer with Ruxit: my antivirus scanner alerted me with a Win32.Evo-Gen suspicion:

Avast virus alert upon Ruxit agent install


It wasn’t really a problem for the agent to install, operate properly and produce data; it was just a little confusing. In essence, the reason for this is fairly obvious: the agent uses a technique comparable to typical virus intrusion patterns, i.e. it sticks its fingers deep into the system.

The second observation was newRelic’s approach to implementing web browser remote checks, called “Synthetics”. It was indeed astonishingly easy to add a URL to the system and let newRelic do their thing – seemingly from within AWS datacenters around the world. And especially with this, newRelic has a very compelling way of displaying the respective information on their Synthetics dashboard. Easy to digest and pretty comprehensive.

At the time I started off with my evaluation, Ruxit didn’t offer that. Meanwhile they have added a beta for “Web Checks” to my account. Equally easy to set up, but lacking some of the richer UI features with regard to displaying the information. I am fairly sure that this’ll be added soon. Hopefully. My take is that combining system monitoring or APM with insights displaying real user usage patterns is an essential part of efficiently correlating events.
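Conceptually, such a web check is little more than a timed HTTP request from a remote location, classified against a threshold. A toy sketch of the idea (the 2-second threshold and the result labels are my own assumptions – real products add scripted browser sessions, multi-location scheduling and much more):

```python
import time
import urllib.request

SLOW_AFTER = 2.0  # seconds; assumed threshold for a "slow" verdict

def classify(elapsed, ok):
    """Map a single measurement to a check result."""
    if not ok:
        return "down"
    return "slow" if elapsed > SLOW_AFTER else "up"

def synthetic_check(url, timeout=10):
    """Time one request to `url` and classify the outcome."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            ok = 200 <= resp.status < 400
    except OSError:
        ok = False
    return classify(time.monotonic() - start, ok)
```

Running such a probe from several regions on a schedule and charting the results is, in essence, what the Synthetics-style dashboards visualize.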


I always spare a second thought for security questions, hence I contemplated Ruxit’s way of making sure that an agent really connects to the right tenant when being installed. With newRelic you’re confronted with an extra step upon installation: they ask you to copy+paste a security key from your account page during their install procedure.

newRelic security key example


Ruxit doesn’t do that. However, they’re not really less secure; it’s just that they pre-embed this key into the installer package that is downloaded, so they’re just a little more convenient. The following shows the msiexec command executed upon installation as well as its parameters, taken from the installer log (you can easily find that information after the .exe package unpacks into the system’s temp folder):

@msiexec /i "%i_msi_dir%\%i_msi%" /L*v %install_log_file% SERVER="%i_server%" PROCESSHOOKING="%i_hooking%" TENANT="%i_tenant%" TENANT_TOKEN="%i_token%" %1 %2 %3 %4 %5 %6 %7 %8 %9 >con:


After having applied the packages (both of them) to my Windows Server on EC2, things popped up quickly within the dashboards (note that both dashboard screenshots are from a later evaluation stage; however, the basic layout was the very same at the beginning – I didn’t change anything visually down the road).

newRelic server monitoring dashboard showing the limits of my too-small instance 🙂

The Ruxit dashboard on the same server; with a clear hint on a memory problem 🙂

What instantly struck me here was the simplicity of Ruxit’s server monitoring information. It seemed sort of “thin” on information (if you want a whole lot of info right from the start, you’ll probably prefer newRelic’s dashboard). Things, though, changed when my server went into memory saturation (which it constantly does right away when accessed via RDP). At that stage, newRelic started firing eMails alerting me of the problem. Also, the dashboard went red. Ruxit in turn did nothing, really. Well, of course, it displayed the problem once I logged into the dashboard again and had a look at my server’s monitoring data; but no alert was triggered, no eMail, no red flag. Nothing.

If you’re into SLA fulfilment, that is precisely the moment to become concerned. On second thought, however, I figured that actually no one was really bothered by the problem. There was no real user interaction going on on that server instance. I hadn’t even really added an app. Hence: why bother?

So, the next step was to figure out why newRelic went so crazy over that. It turned out that with newRelic every newly added server gets assigned a default server policy.

newRelic’s monitoring policy configuration

I could turn off that policy easily (editing apparently is straightforward, too; I didn’t try). However, to think that with every server I add I’d first have to figure out which alerts are important as they might be impacting someone or something seemed less “need to know” than I intended to have.

After having switched off the policy, newRelic went silent.

BTW, alerting via eMail is not set up by default in Ruxit; it can be added as a so-called “Integration” within the tenant’s settings area.

AWS Monitoring

As said above, I was keen to know how both systems integrate multiple monitoring sources into their overviews. My idea was to add my AWS tenant to be monitored (this resulted from the previously mentioned conversations I had had with a customer earlier; that customer’s utmost concern was to add AWS to their monitoring overview – which in their case was Nagios, as said).

A nice thing with Ruxit is that they fill their dashboard with little demo tiles, which easily lead you into their capabilities without your having set up anything yet (the example below shows the database demo tile).

Ruxit demo tile example

This is one of the demo tiles in Ruxit’s dashboard – leading to DB monitoring in this case

I found an AWS demo tile (similar to the example above), clicked, and ended up with a light explanation of how to add an AWS environment to my monitoring ecosystem. They offer key-based or role-based access to your AWS tenant. Basically, what they need you to do is these 3 steps:

  1. Create either a role or a user (for use of access key based connection)
  2. Apply the respective AWS policy to that role/user
  3. Create a new cloud monitoring instance within Ruxit and connect it to that newly created AWS resource from step 1
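Steps 1 and 2 can be sketched with the AWS SDK for Python (boto3). Note that this is only an illustration under my own assumptions: the action list in the policy is a guess at what a read-only monitoring integration plausibly needs (it is not Ruxit’s actual policy), and the user and policy names are made up:

```python
import json

def monitoring_policy():
    """Read-only actions a monitoring integration plausibly needs
    (assumed list, not any vendor's official policy)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "cloudwatch:GetMetricStatistics",
                "cloudwatch:ListMetrics",
                "ec2:DescribeInstances",
                "rds:DescribeDBInstances",
            ],
            "Resource": "*",
        }],
    }

def create_monitoring_user(name="monitoring-integration"):
    """Create an IAM user, attach the policy, return an access key pair."""
    import boto3  # imported lazily: needs configured admin credentials
    iam = boto3.client("iam")
    iam.create_user(UserName=name)
    iam.put_user_policy(
        UserName=name,
        PolicyName="monitoring-read-only",
        PolicyDocument=json.dumps(monitoring_policy()),
    )
    key = iam.create_access_key(UserName=name)["AccessKey"]
    return key["AccessKeyId"], key["SecretAccessKey"]
```

The returned key ID and secret are what you would paste into the monitoring product in step 3; for the role-based variant you would create a role with a trust policy instead of a user.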

Right after executing the steps, the aforementioned demo tile changed into displaying real data and my AWS resources showed up (note that the example below already contains RDS, which I added at a later stage; the cool thing was that it was added fully unattended as soon as I had created it in AWS).

Ruxit AWS monitoring overview


Ruxit essentially monitors everything within AWS which you can put a CloudWatch metric on – which is a fair lot, indeed.
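“Everything you can put a CloudWatch metric on” essentially boils down to calls like `GetMetricStatistics`. A minimal boto3 sketch for one such metric (the instance ID is a placeholder and the 5-minute period is an arbitrary choice of mine):

```python
from datetime import datetime, timedelta, timezone

def cpu_params(instance_id, minutes=60, period=300):
    """Build the request for average CPU of one EC2 instance."""
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": now - timedelta(minutes=minutes),
        "EndTime": now,
        "Period": period,            # seconds per datapoint
        "Statistics": ["Average"],
    }

def fetch_cpu(instance_id):
    """Fetch and time-sort the datapoints (requires AWS credentials)."""
    import boto3  # imported lazily: only needed for the actual call
    cw = boto3.client("cloudwatch")
    resp = cw.get_metric_statistics(**cpu_params(instance_id))
    return sorted(resp["Datapoints"], key=lambda p: p["Timestamp"])
```

A monitoring backend repeats such calls per metric and per resource; the breadth of the CloudWatch namespace list is what makes the “fair lot” possible.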

So, the next step clearly was to seek the same capability within newRelic. As far as I could work out, newRelic’s approach here is to offer plugins – and newRelic’s plugin ecosystem is vast. That may mean that there’s a whole lot of possibilities for integrating monitoring into the respective IT landscape (whatever it may be); however, one may consider the process of adding plugin after plugin (until the whole landscape is covered) a bit cumbersome. Here’s a list of AWS plugins with newRelic:

newRelic plugins for AWS



Adding APM to my monitoring ecosystem was probably the most interesting experience in this whole test: as preparation for the intended result (i.e. analysing data about a web application’s performance under real user interaction) I added IIS to my server and an RDS database to my AWS account (as mentioned before).

The more interesting fact, though, was that after I finalized the IIS installation, Ruxit instantly showed the IIS services in its “Smartscape” view (more on that a little later). I didn’t have to change anything in my Ruxit environment.

newRelic’s approach is a little different here. The below screenshot shows their APM start page with .NET selected.

newRelic APM start page with .NET selected


After confirming each selection, which popped up step by step, I was presented with a download link for another agent package which I had to apply to my server.

The interesting thing, though, was that still nothing showed up. No services or additional information on any accessible apps. That is logical in a way, as I did not yet have anything published on that server which really resembled an application. The only thing accessible from the outside was the IIS default web (just showing the IIS logo).

So, essentially the difference here is that with newRelic you get system monitoring with a system monitoring agent, and by means of an application monitoring agent you can add monitoring of precisely the type of application the agent is intended for.

I didn’t dig further yet (that may be the subject of another article), but it seems that with Ruxit I can monitor anything going on on a server by means of just one install package (maybe one more explanation for the aforementioned virus scan alert).

However, after I published my .NET application, everything was fine again in both systems – and the dashboards went red instantly as the server went into CPU saturation due to its weakness (as intended ;)).

Smartscape – Overview

So, the final question to answer was: what do the dashboards show and how do they ease (root cause) analysis?

As soon as the app was up and running and web requests started to roll in, newRelic displayed everything there is to know about the application’s performance. Particularly nice is the out-of-the-box combination of APM data with browser request data within the first and second menu items (either switch between the 2 by clicking the menu or use the links within the diagrams displayed).

newRelic APM dashboard


The difficulty with newRelic was discovering the essence of the web application’s problem. Transactions and front-end code performance were displayed in every detail, but I knew (from my configuration) that the problem of slow page loads – as displayed – lay in the general weakness of my web server.

And that is basically where Ruxit’s Smartscape tile in their dashboard made the essential difference. The screenshot below shows a problem within my web application as initially displayed in Ruxit’s Smartscape view:

Ruxit’s smartscape view showing a problem in my application

From this view it was obvious that the problem was either within the application itself or within the server as such. A click on the server not only reveals the path to the dependent web application but also other possibly impacted services (obviously without end user impact, as otherwise there would be an alert on them, too).

Ruxit smartscape with dependencies between servers, services, apps


And digging into the server’s details revealed the problem (CPU saturation, unsurprisingly).

Ruxit revealing CPU saturation as a root cause


Still, the number of dashboard alerts was pretty small. While I got 6 eMails from newRelic telling me about the problem on that server, I got only 2 from Ruxit: 1 telling me about the web app’s weak response and another about CPU saturation.

The next step, hence, would be to scale up the server (in my environment) or scale out or implement an enhanced application architecture (in a realistic production scenario). But that’s another story …

Bottom line

Event correlation and alerting on a “need to know” basis – at least for me – remains the right way to go.

This little test was done with just one server, one database, one web application (and a few other services). While newRelic’s comprehensive approach to showing information is really compelling and perfectly serves the objective of complete SLA compliance reporting, Ruxit’s “need to know” principle much better meets what I would expect from innovative cloud monitoring.

Considering Netflix’s philosophy from the beginning of this article, innovative cloud monitoring basically translates into: every extra step is a burden. Every extra piece of information on events without impact means extra OPS effort. And every extra click to correlate different events to a probable common root cause critically lengthens MTTR.

A “need to know” monitoring approach while at the same time offering full stack visibility of correlated events is – for me – one step closer to comprehensive Cloud-ready monitoring and DevOps.

And Ruxit really seems to be “spot on” in that respect!

