The Smile-IT Blog » Blog Archives

Tag Archives: SaaS

How to StartUp inside an Enterprise

I’ve been following Ruxit for quite some time now. In 2014, I first considered them for the Cloud delivery framework we were to create. Later – during another project – I elaborated on a comparison I did between Ruxit and newRelic; I was convinced by their “need to know” approach to monitor large diverse application landscapes.

Recently they added Docker Monitoring to their portfolio and expanded support for highly dynamic infrastructures; here's a great webinar on that (be sure to watch the live demos closely – compelling).

But let's – for once – leave aside the technical masterpieces in their development and have a look at their strategic progression:

Dynatrace – the mothership – has been a well-known player in the monitoring field for years. I work with quite a few customers who leverage Dynatrace's capabilities. I would not hesitate to call them a well-established enterprise. Especially in the field of cloud, well-established enterprises tend to lack a certain elasticity needed to get their X-aaS initiatives to really lift off; examples are manifold: Canopy failed eventually (my 2 cents; some may see that differently), IBM took a long time to differentiate their cloud from the core business, … some others still market their cloud endeavours alongside their core business – not for the better.

And then – last week – I received Ruxit's eMail announcing "Ruxit grows up… announcing Dynatrace Ruxit!", officially sent by "Bernd Greifeneder | Founder and CTO". I was expecting that eMail; in the webinar mentioned before, slides were already branded "Dynatrace Ruxit", and the question I raised on this was answered as expected: from a successful startup-like endeavour they would now commence their move back into the parent company.

Comprehensible.

Because that is precisely what a disruptive endeavour inside a well-established company should look like: Greifeneder was obviously given the trust and money to ramp up a totally new kind of business alongside Dynatrace's core capabilities. I have long lost any doubt that Ruxit created a new way of doing things in monitoring, both technologically and methodically: In a container-based elastic cloud environment, there's no need anymore to know about each and every entity; all that matters is to keep things right for end users – and when that is not the case, to let admins quickly find the problem, and nothing else.

What – though – really baffled me was the rigorous way of pushing their technology into the market: I used to run a test account for a few tests every now and then for my projects. Whenever I logged in, something new had been deployed. Releases happened on an amazingly regular basis – 100% DevOps style. There is no way of doing this within established development processes and traditional on-premise release management. One may be able to derive traditional releases from DevOps-like continuous delivery – but not vice versa.

Bottom line: Greifeneder obviously had the possibility, the ability and the right people to do things in a totally different way from the mothership's processes. I, of course, do not have insight into how things were really set up within Dynatrace – but last week they took their baby back into "mother's bosom", and in the cloud business – I'd argue – that does not happen when the baby isn't ready to live on its own.

Respect!

Enterprise cloud and digitalisation endeavours may get their learnings from Dynatrace Ruxit. Wishing you a sunny future, Dynatrace Monitoring Cloud!

 


Scaling: Where to?

Pushed by some friends – and at a fortunately discounted price – I finally gave in and made it to an AWS exam; so call me certified now. Just so.

Solutions-Architect-Associate

However, this has nothing to do with the matter in discussion here: While awaiting the exam, I talked to some fellow cloud geeks and was surprised once again by them confusing up and down with out and in – as experienced so many times before; so: this is about "Scaling"!

And it’s really simple – here’s the boring piece:

The Principle

Both scaling patterns (up/down and out/in) in essence serve the same purpose and act by the same principle: Upon explicit demand or implicit recognition, the amount of compute/memory resources is increased or decreased. Whether this is done

  • proactively or reactively (see guidelines above)
  • automatically
  • or manually through a portal by authorized personnel

is subject to the framework implementation possibilities and respective architecture decisions.
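To make the principle a bit more tangible, here's a minimal, purely hypothetical Python sketch of what a reactive trigger inside such a scalability building block could boil down to – all names and thresholds are invented for illustration:

# Hypothetical reactive scaling trigger - names and thresholds are illustrative only
CPU_HIGH = 0.80   # grow once average CPU load exceeds 80%
CPU_LOW = 0.20    # shrink again once it drops below 20%

def reactive_trigger(avg_cpu: float, prefer_vertical: bool) -> str:
    """Decide which scaling action to request in the next control-loop cycle."""
    if avg_cpu > CPU_HIGH:
        # either grow the single instance (up) or add an instance (out)
        return "scale-up" if prefer_vertical else "scale-out"
    if avg_cpu < CPU_LOW:
        return "scale-down" if prefer_vertical else "scale-in"
    return "no-op"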

What’s up and down?

For the scale-up/-down pattern, the hypervisor (or IaaS service layer) managing cloud compute resources has to provide the ability to dynamically (ideally without an outage, though that's not guaranteed) increase or decrease the compute and memory resources of a single machine instance.

The trigger for scaling execution can either be implemented within a dedicated cloud automation engine, within the Scalability building block or as part of a self-service portal command, depending on the intended flexibility.
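Just to illustrate what such a trigger would execute on AWS (where resizing an EC2 instance currently implies a stop/start, i.e. a short outage), a scale-up could boil down to a boto3 sketch like the following – instance ID and target type are made up:

import boto3

ec2 = boto3.client("ec2")

def scale_up(instance_id: str, target_type: str = "m4.large") -> None:
    """Resize a single EC2 instance; the stop/start means a short outage."""
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    # change the instance type, then bring the machine back up
    ec2.modify_instance_attribute(InstanceId=instance_id,
                                  InstanceType={"Value": target_type})
    ec2.start_instances(InstanceIds=[instance_id])

# scale_up("i-0123456789abcdef0")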

What’s in and out?

The same principles apply as for the scale-up/-down pattern; however, on scaling out, an additional instance is created for the same service. This may involve one of the following alternatives:

  • Create an instance and re-configure the service to route requests to the new instance additionally
  • Create an instance, re-configure the service to route requests to the newly created instance only and de-provision an existing instance with lower capacity accordingly

Both cases possibly demand (automated) load balancer reconfiguration and the capability of the application to deal with changing server resources.

Conversely, scale-in means de-provisioning instances once load parameters have sufficiently decreased. An application on top has to be able to deal with dynamically de-provisioned server instances. In a scenario where the application's data layer is involved in the scaling process (i.e. a DBMS is part of the server to be de-provisioned), measures have to be taken by the application to reliably persist data before shutdown of the respective resource.
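On AWS, the load balancer reconfiguration mentioned above is largely taken care of by an Auto Scaling group, so scaling out or in reduces to adjusting the desired capacity. A hedged boto3 sketch, with a made-up group name:

import boto3

autoscaling = boto3.client("autoscaling")

def scale_out_in(group_name: str, delta: int) -> int:
    """Add (delta > 0) or remove (delta < 0) instances behind the load balancer."""
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name])["AutoScalingGroups"][0]
    # stay within the group's configured minimum and maximum
    desired = max(group["MinSize"],
                  min(group["MaxSize"], group["DesiredCapacity"] + delta))
    autoscaling.set_desired_capacity(AutoScalingGroupName=group_name,
                                     DesiredCapacity=desired,
                                     HonorCooldown=True)
    return desired

# scale_out_in("my-web-asg", +1)   # scale-out
# scale_out_in("my-web-asg", -1)   # scale-in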

And now – for the more funny part

It occurred to me that two silly graphics could make the difference easier to remember, hence I invite you all to think of a memory IC as a little bug climbing up a ladder and, in turn, of computers bursting out of a data center. Does that make it easier to distinguish the two patterns?

Scaling up and down

Scaling out and in

 

You think you don’t need to care?

Well – as a SaaS consumer you're right: As long as your tenant scales and responds accurately to any performance demand – no worries. But as soon as things deviate from this, you're in trouble finding the right application architecture if it's unclear whether you're to scale resources or instances. So — remember the bug and the fleeing servers 🙂

 


Evaluation Report – Monitoring Comparison: newRelic vs. Ruxit

I've worked on cloud computing frameworks with a couple of companies in the meantime. DevOps-like processes are always an issue in these cooperations – even more so when it comes to monitoring and how to approach the matter innovatively.

As an example, I am ever and again emphasizing Netflix's approach in these conversations: I very much like Netflix's philosophy of how to deploy, operate and continuously change environments and services. Netflix's different component teams do not have any clue about the activities of other component teams; their policy is that every team is self-responsible for its changes not breaking anything in the overall system. Also, no one really knows in detail which servers, instances or services are up and running to serve requests. Servers and services are constantly and automatically re-instantiated, rebooted, added, removed, etc. Such a philosophy makes DevOps real.

Clearly, when monitoring such a landscape, traditional (SLA-fulfilment oriented) methods must fail. It simply isn't sufficient for a Cloud-aware, continuous-delivery-oriented monitoring system to just integrate traditional on-premise monitoring solutions like Nagios with e.g. AWS' CloudWatch. Well, we know that this works fine, but it does not yet ease the cumbersome work of NOCs or Application Operators to quickly identify

  1. the impact of a certain alert, hence its priority for ongoing operations and
  2. the root cause for a possible error

After discussing these facts for the umpteenth time and (again) being confronted with the same old arguments about the importance of ubiquitous information on every single event within a system (for the sake of proving SLA compliance), I thought to give it a try and dig deeper by myself to find out whether these arguments are valid (and I am therefore wrong) or whether there is a possibility to substantially reduce event occurrence and let IT personnel only follow up on the really important stuff. Efficiently.

At this stage, it is time for a little

DISCLAIMER: I am not a monitoring or APM expert; neither am I a .NET programming expert. Both skill areas are fairly familiar to me, but in this case I intentionally approached the matter from a business perspective – as non-technically as possible.

The Preps

In autumn last year I had the chance to get a little insight into 2 pure-SaaS monitoring products: Ruxit and newRelic. Ruxit back then was – well – a baby: early beta, no real functionality, but a well-received glimpse of what the guys are up to. newRelic was already pretty strong and I very much liked their light and quick way of getting started.

As that project got stuck back then and I had ended my evaluations midway through gaining insight, I thought getting back to it could be a good starting point (especially as I wasn't able to find any other monitoring product going the SaaS path that radically, i.e. not even thinking of offering an on-premise option; and as a cloud "aficionado" I was very keen on seeing a full-stack SaaS approach). So the product scope was set pretty straight.

The investigative scope, this time, was to answer questions in a somewhat more structured way:

  1. How easy is it to kick off monitoring within one system?
  2. How easy is it to combine multiple systems (on-premise and cloud) within one easy-to-digest overview?
  3. What’s alerted and why?
  4. What steps are needed in order to add APM to a system already monitored?
  5. How are events correlated and how is that correlation presented?
  6. The “need to know” principle: Impact versus alert appearance?

The setup I used was fairly simple (and reduced – as I didn’t want to bother our customer’s workloads in any of their datacenters): I had an old t1.micro instance still lurking around on my AWS account; this is 1 vCPU with 613MB RAM – far too small to really perform with the stuff I wanted it to do. I intentionally decided to use that one for my tests. Later, the following was added to the overall setup:

  • An RDS SQL Server database (which I used for the application I wanted to add to the environment at a later stage)
  • IIS 6 (as available within the Server image that my EC2 instance is using)
  • .NET framework 4
  • Some .NET sample application (some “Contoso” app; deployed directly from within Visual Studio – no changes to the defaults)

Immediate Observations

2 things caught my eye within hours (if not minutes) of commencing my activities in newRelic and Ruxit, but let's first start with the basics.

Setting up accounts is easy and straightforward in both systems. They both truly follow the cloud-affine "on-demand" characteristic. newRelic creates a free "Pro" trial account which is converted into a lifetime free account when not upgraded to "paid" after 14 days. Ruxit sets up a free account for their product but takes a totally different approach – more closely resembling consumption-based pricing: you get 1000 hours of APM and 50k user visits for free.

Both systems follow pretty much the same path after an account has been created:

  • In the best case, access your account from within the system you want to monitor (or deploy the downloaded installer package – see below – to the target system manually)
  • Download the appropriate monitoring agent and run the installer. Done.

Both agents started to collect data immediately and the browser-based dashboards produced the first overview of my system within some minutes.

As a second step, I also installed the agents on my local client machine as I wanted to know how the dashboards display multiple systems – and here's a bummer with Ruxit: My antivirus scanner alerted me with a Win32.Evo-Gen suspicion:

Avast virus alert upon Ruxit agent install

It wasn’t really a problem for the agent to install and operate properly and produce data; it was just a little confusing. In essence, the reason for this is fairly obvious: The agent is using a technique which is comparable to typical virus intrusion patterns, i.e. sticking its fingers deep into the system.

The second observation was newRelic's approach to implementing web browser remote checks, called "Synthetics". It was indeed astonishingly easy to add a URL to the system and let newRelic do their thing – seemingly from within AWS datacenters around the world. And especially with this, newRelic has a very compelling way of displaying the respective information on their Synthetics dashboard. Easy to digest and pretty comprehensive.

At the time I started off with my evaluation, Ruxit didn't offer that. Meanwhile they have added their beta for "Web Checks" to my account. Equally easy to set up but lacking some of the richer UI features with regard to displaying the information. I am fairly sure that this will be added soon. Hopefully. My take is that combining system monitoring or APM with insights into real user usage patterns is an essential part of efficiently correlating events.

Security

I always give security questions a second thought, hence I contemplated Ruxit's way of making sure that an agent really connects to the right tenant when being installed. With newRelic you're confronted with an extra step upon installation: They ask you to copy+paste a security key from your account page during their install procedure.

newRelic security key example

Ruxit doesn't do that. However, they're not really less secure; it's just that they pre-embed this key into the installer package that is downloaded, so they're just a little more convenient. The following shows the msiexec command executed upon installation as well as its parameters taken from the installer log (you can easily find that information after the .exe package unpacks into the system's temp folder):

@msiexec /i "%i_msi_dir%\%i_msi%" /L*v %install_log_file% SERVER="%i_server%" PROCESSHOOKING="%i_hooking%" TENANT="%i_tenant%" TENANT_TOKEN="%i_token%" %1 %2 %3 %4 %5 %6 %7 %8 %9 >con:
MSI (c) (5C:74) [13:35:21:458]: Command Line: SERVER=https://qvp18043.live.ruxit.com:443 PROCESSHOOKING=1 TENANT=qvp18043 TENANT_TOKEN=ABCdefGHI4JKLM5n CURRENTDIRECTORY=C:\Users\thome\Downloads CLIENTUILEVEL=0 CLIENTPROCESSID=43100

Alerting

After having applied the package (both packages) onto my Windows Server on EC2, things popped up quickly within the dashboards (note that both dashboard screenshots are from a later evaluation stage; however, the basic layout was the very same at the beginning – I didn't change anything visually down the road).

newRelic server monitoring dashboard showing the limits of my too-small instance 🙂

The Ruxit dashboard on the same server; with a clear hint on a memory problem 🙂

What instantly struck me here was the simplicity of Ruxit's server monitoring information. It seemed sort of "thin" on information (if you want a whole lot of info right from the start, you'll probably prefer newRelic's dashboard). Things, though, changed when my server went into memory saturation (which it constantly does right away when accessed via RDP). At that stage, newRelic started firing eMails alerting me of the problem. Also, the dashboard went red. Ruxit, in turn, did nothing really. Well, of course, it displayed the problem once I was logged into the dashboard again and had a look at my server's monitoring data; but no alert was triggered, no eMail, no red flag. Nothing.

If you’re into SLA fulfilment, then that is precisely the moment to become concerned. On second thought, however, I figured that actually no one was really bothered by the problem. There was no real user interaction going on in that server instance. I hadn’t even added an app really. Hence: why bother?

So, the next step was to figure out why newRelic went so crazy with that. It turned out that with newRelic every newly added server gets assigned to a default server policy.

newRelic's monitoring policy configuration

I could turn off that policy easily (editing apparently is straightforward as well; I didn't try). However, having to figure out for every server I add which alerts are actually important – because they might impact someone or something – seemed less of a "need to know" basis than I intended to have.

After having switched off the policy, newRelic went silent.

BTW, alerting via eMail is not set up by default in Ruxit; within the tenant's settings area, this can be added as a so-called "Integration" point.

AWS Monitoring

As said above, I was keen to know how both systems integrate multiple monitoring sources into their overviews. My idea was to add my AWS tenant to be monitored (this resulted from the customer conversations mentioned earlier; that customer's utmost concern was to add AWS to their monitoring overview – which in their case was Nagios, as said).

A nice thing with Ruxit is that they fill their dashboard with little demo tiles, which easily lead you to their capabilities without having set up anything yet (the example below shows the database demo tile).

This is one of the demo tiles in Ruxit's dashboard – leading to DB monitoring in this case

I found an AWS demo tile (similar to the example above), clicked, and ended up with a light explanation of how to add an AWS environment to my monitoring ecosystem (https://help.ruxit.com/pages/viewpage.action?pageId=9994248). They offer key-based or role-based access to your AWS tenant. Basically what they need you to do is these 3 steps (a rough sketch of the first two follows right after the list):

  1. Create either a role or a user (for use of access key based connection)
  2. Apply the respective AWS policy to that role/user
  3. Create a new cloud monitoring instance within Ruxit and connect it to that newly created AWS resource from step 1
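For illustration, here is a rough boto3 sketch of what steps 1 and 2 could look like for the role-based variant. Role name and trusted account are placeholders, and I'm attaching AWS's generic ReadOnlyAccess policy just to keep the sketch short – the exact trust relationship and the minimal policy Ruxit expects are spelled out on the help page linked above:

import json
import boto3

iam = boto3.client("iam")

# Placeholder trust policy - the vendor documentation defines which account
# and conditions actually need to be trusted here.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:root"},  # placeholder account
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(RoleName="monitoring-readonly",
                AssumeRolePolicyDocument=json.dumps(trust_policy))

# ReadOnlyAccess is a broad placeholder; a tighter, CloudWatch-focused policy
# would be preferable in a real setup.
iam.attach_role_policy(RoleName="monitoring-readonly",
                       PolicyArn="arn:aws:iam::aws:policy/ReadOnlyAccess")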

Right after having executed the steps, the aforementioned demo tile changed into displaying real data and my AWS resources showed up (note that the example below already contains RDS, which I added at a later stage; the cool thing here was that it was added fully unattended as soon as I had created it in AWS).

Ruxit AWS monitoring overview

Ruxit essentially monitors everything within AWS which you can put a CloudWatch metric on – which is a fair lot, indeed.
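Under the hood this boils down to the CloudWatch API; pulling such a metric yourself takes only a few lines of boto3 (sketch only, the instance ID is made up):

from datetime import datetime, timedelta
import boto3

cloudwatch = boto3.client("cloudwatch")

# Average CPU utilization of one EC2 instance over the last hour, in 5-minute buckets
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=["Average"],
)
print(stats["Datapoints"])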

So, the next step clearly was to seek the same capability within newRelic. As far as I could work out, newRelic's approach here is to offer plugins – and newRelic's plugin ecosystem is vast. That may mean there's a whole lot of possibilities for integrating monitoring into the respective IT landscape (whatever it may be); however, one may consider the process of adding plugin after plugin (until the whole landscape is covered) a bit cumbersome. Here's a list of AWS plugins with newRelic:

newRelic plugins for AWS

Add APM

Adding APM to my monitoring ecosystem was probably the most interesting experience in this whole test: As preparation for the intended result (i.e. analysing data about a web application's performance under real user interaction) I added an IIS to my server and an RDS database to my AWS account (as mentioned before).

The more interesting fact, though, was that after having finalized the IIS installation, Ruxit instantly showed the IIS services in their “Smartscape” view (more on that a little later). I didn’t have to change anything in my Ruxit environment.

newRelic’s approach is a little different here. The below screenshot shows their APM start page with .NET selected.

newRelic APM start page with .NET selected

After having confirmed each selection which popped up step by step, I was presented with a download link for another agent package which I had to apply to my server.

The interesting thing, though, was that still nothing showed up. No services or additional information on any accessible apps. That is logical in a way, as I did not yet have anything published on that server which really resembled an application. The only thing accessible from the outside was the IIS default web (just showing that IIS logo).

So, essentially the difference here is that with newRelic you get system monitoring with a system monitoring agent, and by means of an application monitoring agent you can add monitoring of precisely the type of application the agent is intended for.

I didn’t dig further yet (that may be subject for another article), but it seems that with Ruxit I can have monitoring for anything going on on a server by means of just one install package (maybe one more explanation for the aforementioned virus scan alert).

However, after having published my .NET application, everything was fine again in both systems – and the dashboards went red instantly as the server went into CPU saturation due to its weakness (as intended ;)).

Smartscape – Overview

So, final question to answer was: What do the dashboards show and how do they ease (root cause) analysis?

As soon as the app was up and running and web requests started to roll in, newRelic displayed everything there is to know about the application's performance. Particularly nice is the out-of-the-box combination of APM data with browser request data within the first and second menu items (either switch between the 2 by clicking the menu or use the links within the diagrams displayed).

newRelic APM dashboard

The difficulty with newRelic was to discover the essence of the web application's problem. Transactions and front-end code performance were displayed in every detail, but I knew (from my configuration) that the problem of slow page loads – as displayed – lay in the general weakness of my web server.

And that is basically where Ruxit’s smartscape tile in their dashboard made the essential difference. The below screenshot shows a problem within my web application as initially displayed in Ruxit’s smartscape view:

Ruxit's smartscape view showing a problem in my application

From this view, it was obvious that the problem was either within the application itself or within the server as such. A click on the server not only reveals the path to the dependent web application but also other possibly impacted services (obviously without end user impact, as otherwise there would be an alert on them, too).

Ruxit smartscape with dependencies between servers, services, apps

And digging into the server’s details revealed the problem (CPU saturation, unsurprisingly).

Ruxit revealing CPU saturation as a root cause

Still, the number of dashboard alerts was pretty low. While I had 6 eMails from newRelic telling me about the problem on that server, I had only 2 from Ruxit: 1 telling me about the web app's weak response and another about CPU saturation.

The next step, hence, would be to scale up the server (in my environment) or scale out or implement an enhanced application architecture (in a realistic production scenario). But that's another story …

Bottom line

Event correlation and alerting on a “need to know” basis – at least for me – remains the right way to go.

This little test was done with just one server, one database, one web application (and a few other services). While newRelic's comprehensive approach to showing information is really compelling and perfectly serves the objective of complete SLA compliance reporting, Ruxit's "need to know" principle much better meets what I would expect from innovative cloud monitoring.

Considering Netflix’s philosophy from the beginning of this article, innovative cloud monitoring basically translates into: Every extra step is a burden. Every extra information on events without impact means extra OPS effort. And every extra-click to correlate different events to a probable common root-cause critically lengthens MTTR.

A “need to know” monitoring approach while at the same time offering full stack visibility of correlated events is – for me – one step closer to comprehensive Cloud-ready monitoring and DevOps.

And Ruxit really seems to be “spot on” in that respect!

 


DevOps style performance monitoring for .NET

 

{{ this article has originally been published in DevOps.com }}

 

Recently I began looking for an application performance management solution for .NET. My requirements are code level visibility, end to end request tracing, and infrastructure monitoring in a DevOps production setup.

DotTrace is clearly the most well-known tool for code level visibility in development setups, but it can’t be used in a 24×7 production setup. DotTrace also doesn’t do typical Ops monitoring.

Unfortunately a Google search didn’t return much in terms of a tool comparison for .NET production monitoring. So I decided to do some research on my own. Following is a short list of well-known tools in the APM space that support .NET. My focus is on finding an end-to-end solution and profiler-like visibility into transactions.

New Relic was the first to do APM SaaS, focused squarely on production with a complete offering. New Relic offers web request monitoring for .NET, Java, and more. It automatically shows a component-based breakdown of the most important requests. The breakdown is fairly intuitive to use and goes down to the SQL level. Code level visibility, at least for .NET, is achieved by manually starting and stopping sampling. This is fine for analyzing currently running applications, but makes analysis of past problems a challenge. New Relic's main advantage is its ease of use, intuitive UI, and a feature set that can help you quickly identify simple issues. Depth is the main weakness of New Relic. As soon as you try to dig deeper into the data, you're stuck. This might be a minor point, but if you're used to working with a profiler, you'll miss CPU breakdown as New Relic only shows response times.

net-1-newrelic

Dynatrace is the vendor that started the APM revolution and is definitely the strongest horse in this race. Its feature set in terms of .NET is the most complete, offering code level monitoring (including CPU and wait times), end to end tracing, and user experience monitoring. As far as I can determine, it’s the only tool with a memory profiler for .NET and it also features IIS web request insight. It supports the entire application life cycle from development environments, to load testing, to production. As such it’s nearly perfect for DevOps. Due to its pricing structure and architecture it’s targeted more at the mid to enterprise markets. In terms of ease of use it’s catching up to competition with a new Web UI. It’s rather light on infrastructure monitoring on its own, but shows additional strength with optional Dynatrace synthetic and network monitoring components.

net-2-dynatrace

Ruxit is a new SaaS solution built by Dynatrace. It's unique in that it unites application performance management and real user monitoring with infrastructure, cloud, and network monitoring in a single product. It is by far the easiest to install; it literally takes 2 minutes. It features full end to end tracing, code level visibility down to the method level, SQL visibility, and RUM for .NET, Java, and other languages, with insight into IIS and Apache. Apart from this it has an analytics engine that delivers both technical and user experience insights. Its main advantages are its ease of use, web UI, fully automated root cause analysis, and frankly, amazing breadth. Its flexible consumption-based pricing scales from startups, cloud natives, and mid markets up to large web scale deployments of tens of thousands of servers.

net-3-ruxit

AppNeta's TraceView takes a different approach to application performance management. It does support tracing across most major languages, including database statements and of course .NET. It visualizes things in charts and scatter plots. Even traces across multiple layers and applications are visualized in graphs. This has its advantages but takes some time to get used to. Unfortunately, while TraceView does support .NET, it does not yet have code level visibility for it. This makes sense for AppNeta, which as a whole is more focused on large scale monitoring and has more of a network-centric background. For DevOps in .NET environments, however, it's a bit lacking.

net-4-TraceView

Foglight, originally owned by Quest and now owned by Dell, is a well-known application performance management solution. It is clearly meant for operations monitoring and tracks all web requests. It integrates infrastructure and application monitoring, end to end tracing, and code level visibility on .NET, among other things. It has the required depth, but it's rather complex to set up and, as far as I could experience, generates alert storms. It takes a while to configure and get the data you need. Once properly set up though, you get a lot of insight into your .NET application. In a fast moving DevOps scenario, though, it might take too long to manually adapt to infrastructure changes.

net-5-foglight

AppDynamics is well known in the APM space. Its offering is quite complete and it features .NET monitoring, quite nice transaction flow tracing, user experience, and code level profiling capabilities. It is production capable, though code level visibility may be limited here to reduce overhead. Apart from these features though, AppDynamics has some weaknesses, mainly the lack of IIS request visibility and the fact that it only features wall clock time with no CPU breakdown. Its flash-based web UI and rather cumbersome agent configuration can also be counted as negatives. Compared to others it's also lacking in terms of infrastructure monitoring. Its pricing structure definitely targets the mid market.

net-6-AppDynamics

ManageEngine has traditionally focused on IT monitoring, but in recent years they added end user and application performance monitoring, called APM Insight, to their portfolio. ManageEngine does give you metric level insight into .NET applications and transaction trace snapshots which give you code level stack traces and database interactions. However, it's apparent that ManageEngine is a monitoring tool, and APM Insight doesn't provide the level of depth one might be accustomed to from other APM tools and profilers.

net-7-ME

JenniferSoft is a monitoring solution that provides nice real-time dashboarding and gives an overview of the topology of your environment. It enables users to see deviations in the speed of transactions with real time scatter charts and analysis of transactions. It provides “profiling” for IIS/.NET transactions, but only on single tiers and has no transaction tracing. Their strong suit is clearly cool dashboarding but not necessarily analytics. For example, they are the only vendor that features 3D animated dashboards.

net-8-JenniferSoft

Conclusion: There's more buzz in the APM space than a Google search would reveal at first sight, and I did actually discover some cool vendors to target my needs; however, the field thins out pretty quickly when you dig for end-2-end visibility from code down to infrastructure, including RUM, web service requests and deep SQL insights. And if you want to pair that with a nice, fluent, easy-to-use web UI and efficient analytics, there are actually not many left …


Synced – but where?

We had eventually set up our Office 365 (O365) tenant for eMail (read about that e.g. in the "Autodiscover" post) and, of course, wanted to leverage SharePoint as well. My-SharePoint, too. And the "OneDrive for Business" sync client (ODB) … what else.

It wasn't accomplished without further ado, though …

Setup

is very straightforward, indeed. Go to your "OneDrive" in the O365 portal and click the "sync" link at the top of the page:

O365 Sync Link, displayed

Presuming you've got the Office on-premise applications installed on your PC, items will quickly commence showing up in your "OneDrive for Business" folder within the "Favorites" area of Explorer.

Also, ODB is nice enough to offer to jump to that folder by just clicking the little button in the bottom right corner of the confirmation dialog that appears after syncing has been initiated:

O365 ODB Confirmation popup

Easy, isn't it?

Sharing

Now, having files made accessible in Explorer, the next thing would be to share them with others in your organization. ODB is nice here as well: it offers a "Share…" option in the Explorer context menu by which you're able to launch a convenient "share'n'invite" browser popup with all the necessary options:

O365 ODB Share Options

This one is also very straightforward in that

  • you just type in a name,
  • O365 will help you with auto-completion of known names,
  • you select whether people shall be able to edit or to only view items
  • you’re even able to create an “edit” or “view” link which will allow people to access items without dedicated invitation
  • etc.

So – no rocket science here. Users will easily be able to share their files with others. And once one is done with sharing, invited colleagues will receive an eMail invitation to the newly shared stuff which takes them into their browser and into O365 to actually see what has been shared with them.

Great!

And now …

Get that into your Windows Explorer!

Once the necessary items were shared with every user of our tenant as needed, I (at least) went right into my ODB sync folder in Explorer to await the newly shared files showing up. OK, ODB takes a little while to sync it all (Hans Brender, Microsoft MVP for ODB, hence a real expert, wrote a great post on how syncing works in ODB). However, even waiting infinitely wouldn't have led to us seeing any shared files. What we pretty quickly learned was that the ODB sync client will – in its initial default setup – never ever sync anything that was shared with you. Only your own files will show up in Explorer. Period.

Makes no sense really for collaboration, does it? But …:

Here are some solutions:

1. Access files that are shared only through your browser

Anything that has been shared with you is accessible within your browser-based ODB access. Just click "shared with me" in the navigation on the left of the ODB portal and you'll see it.

O365 ODB: Shared with me

Pretty lame, though, for anyone who's used to working from within Explorer and not e.g. from the browser or any of the Office applications.

2. Create a teamsite for documents shared between multiple colleagues

O365 with its Sharepoint functionality, of course, does offer the ability to create a site – which also contains a document library. Documents put there are available for anyone with permissions on that site. Permissions can even be set on a more granular level (e.g. management-only for certain folders, team-A on team-A-folders only, etc.).

Navigating to that site's document library offers you the same "sync" link possibility as with your own files (see screenshot above), i.e. in a few moments ODB will sync any file that you're eligible to view or edit.

Nice. But what if creating umpteen sites for all the different project and team setups within your company is just not what you want? Or what if managing all the various permission sets within one site is just beyond the acceptable effort for your IT team? There's at least one more possibility that might help:

3. Sync your team-mate's ODB document library

As you already know, every O365 user has their own ODB site, which is invoked when one clicks OneDrive in the O365 portal. When being invited to view or edit shared files and taken to the respective ODB site in your browser, you actually end up within the document library of someone else:

O365 ODB: Sync someone else's docs

Well — sync that! Just click the "sync" link on top as described before and the ODB client on your PC adds another folder to the ODB folders in Explorer. And that folder will show exactly what has been shared with you from that library. Not 100% perfect maybe, as it leaves you having to know "who" shared "what" with you, but still a way to work around having to create a teamsite or working from within the browser only, if you don't want to.

Anybody out there knowing other options how to conveniently add shared files and folders to the local ODB folder tree? Please share your insight in a comment!

P.S. – what about the doc-lib links?

In case you do not want to go by the "sync" link in the ODB portal to invoke ODB synchronization but want to add libraries within your ODB sync client on your PC, right-click the ODB tray icon (the little blue cloud – not the white one, that's OneDrive, formerly aka "SkyDrive" ;)) and click "Sync a new library". And here's what to use for the syncing options discussed above:

  1. Your own ODB library: https://<company-name>-my.sharepoint.com/personal/<your-username>_<domain>_<TLD>/Documents (where <your-username> is what you’re called in that O365 tenant, e.g. johndoe, and <domain> and <TLD> is what your company is called in that tenant, e.g. contoso.com – in that case it would be “johndoe_contoso_com“)
  2. Teamsite: https://<company-name>.sharepoint.com/<name-of-shared-documents-library> (which depends on the initial language at the initial site setup; e.g. "Freigegebene%20Dokumente" in case setup was done in German)
  3. Someone else’s ODB library: https://<company-name>-my.sharepoint.com/personal/<that-person’s-username>_<domain>_<TLD>/Documents – i.e.: instead of using your name as described in (1) above, you’ve just to exchange that with the username of that other person who’s library you want to sync
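Purely for illustration, here's how those three patterns come together for a hypothetical tenant – all names are made up:

# Hypothetical tenant and users - adjust to your own O365 naming
tenant = "contoso"                 # <company-name>
user = "johndoe_contoso_com"       # <your-username>_<domain>_<TLD>
colleague = "janedoe_contoso_com"  # the person whose library you want to sync

own_library       = f"https://{tenant}-my.sharepoint.com/personal/{user}/Documents"
teamsite_library  = f"https://{tenant}.sharepoint.com/Freigegebene%20Dokumente"  # library name depends on site language
colleague_library = f"https://{tenant}-my.sharepoint.com/personal/{colleague}/Documents"

print(own_library, teamsite_library, colleague_library, sep="\n")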

But I think how to format all those links correctly for use in the ODB client's "Sync a new library" dialog has already been discussed all over the place in multiple posts on the web anyway.

 


3 things to consider when creating a new SaaS product – Automic blog post

You want to create a new product. At the same time you want to create a new delivery model – e.g. Software-as-a-Service (SaaS) – for your new product.

Of course – these days – you also want to create a new pricing model for your product. And you want to increase delivery speed while maintaining product quality, of course, as high as it always was.

Ultimately you also want to keep the consistent flow of upgrades, patches, and hotfixes, for your existing enterprise product landscape intact.

Challenging? Yes. Impossible? No.

 

1. Deal proactively with the risk of technical debt

Creating something of this complexity within such a short timeframe can easily lead to artefacts being developed that play together well at first but never scale to the extent that an enterprise-ready product needs to.

A clear architectural separation of concerns between entities, while at the same time keeping focus on a concise end-to-end architecture for all building blocks, is key to avoiding technical debt right from the beginning.

One approach of achieving that focus is, of course, to invest heavily into upfront creation of architecture and design specifications.

However, this approach might just not serve the goal of a short time-to-market sufficiently.

Hence, the only way of maintaining the path and thereby reducing technical debt is to create just enough specs to form the boundaries within which a team of brilliant technologists can quickly develop the MVP – the minimum viable product – to be pushed out, while at the same time staying focused on the broader goal.

2. Be DevOps from the beginning

One might consider the creation of a new product within a new delivery model (like SaaS) to be just another product development.

Here at Automic we have a lean product development process in place based on agile patterns and already tailored to our customers’ needs with regards to fast change request and hotfix delivery.

However, approaching SaaS with this speed instantly surfaces the need of something new in that space. Hence – along with a concise architecture specification – you need to create not only a DevOps oriented tool chain but at the same time a DevOps culture between the involved organizational units.

DevOps – if implemented end-to-end – changes your delivery, maintenance and operations pipeline completely. Developers are challenged by the fact that their deliverables are instantly deployed to a QA system, test engineers change their focus from testing to end-2-end test automation in order to support automated production deployments, and operations starts to deal with an application-centric rather than a system-centric view of their environments.

Setting the stage by creating a DevOps funnel from the very beginning is key to delivering not only the MVP but also its constant continuous enhancements.

3. Create a consistent architecture model of Automation and Orchestration

Having a solid enterprise-ready SaaS-ified product in place is a major challenge in itself. Creating a solid delivery framework of support services for operations and business processes clearly adds a significant level of complexity.

The cornerstone of this is a strong Automation layer defining its capabilities into clearly separated building blocks for the respective purposes (e.g. customer instance management, component packaging, user provisioning, etc.). Put them into the entities they clearly belong to.

Do not put capabilities (logic, functionality) into a building block or component that actually serves a different purpose. Create small functional entities within the Automation layer and orchestrate them into a support service for a well-defined purpose within the framework.

Holding these paradigms high during the minimal viable design process as well as during the rapid – somehow prototype-like – creation of the MVP will later allow you to decouple and re-bundle entities along the path of scaling your building blocks and your entire delivery framework. Of course, a strong Automation product tremendously eases achieving this goal.

Are you involved in creating a SaaS product? What have you learned from the experience? We’d be keen to get your thoughts in the comments below.

 

( This post was also published in the official Automic company blog: http://blog.automic.com/3-things-to-consider-when-creating-a-new-saas-product )

 


Challenging Security

Security standards, guidelines, recommendations and audit instructions seem to evolve from nowhere, just like weeds, wherever you need them least. And – to share the bad news first: There's no way out, no way to avoid any newly created standard – at least not as soon as anybody in the field decides to adopt it. You'll instantly be second best.

 

“The nice thing about standards is that you have so many to choose from”

says Andrew Tanenbaum.

 

I dived into security standards recently and got pretty bugged by the number of standards to choose from; hence, I started to note things down in a structured manner and – well — dumped it here to re-find it (and to get your thoughts on it, to be honest …)

 

Some slight differences to know

There are (security) standards, (security) reporting standards and (security) attestation standards.

ISO 27001 – often quoted as a "data center security standard" – is actually a process and control definition for information security matters in organizations dealing with information in the broadest possible sense. ISO27000.org names it a "specification for an ISMS" (Information Security Management System). Actually it is the only real standard dealing with information security as such.

SOC ("Service Organization Control"), for example, is a reporting standard specifying how an organization or a certified public accountant (CPA) would issue reports according to other common (security control) standards such as SSAE16 or AT Section 101.

Having said that, it is further important to understand that – e.g. – SAS70 (deprecated) or its replacement SSAE16 describe a standard for attesting controls at service organizations. In other words, these standards set the guidance for assessing (a set of) controls which serve the purpose of helping an organization adhere to (security – but not only security) regulations, both financially and technically.

Finally: By ensuring compliance with the respective standard as well as reporting on that compliance, the organization at the same time proves (to itself as well as to customers) that it adheres to the standard, hence has and keeps a respective level of security and (technical or financial) compliance.

It is a matter of fact – unfortunately, if I may say – that ensuring compliance as well as reporting on it follows myriads of guidelines and policies, and Cloud/SaaS providers will most probably need a bunch of analgesics to get rid of their headaches again.

I'm gonna provide an analgesics starter package in the next few lines …:

 

Attestation Standards

SSAE16 – Statement on Standards for Attestation Engagements No. 16

  • replaces SAS70
  • is issued by the American Institute of Certified Public Accountants (AICPA)
  • has an international equivalent – the International Standard on Assurance Engagements – ISAE 3402
  • is a framework
  • requires service organizations to provide a description of their system to control financial transactions
  • plus(!) a written assertion by management of the organization (which is a significant add-on compared to the former SAS70)

A good summary on SSAE16 can be found here. Overview on ISAE 3402 is provided here.

AT Section 101

To put it very simply, AT Section 101 adds guidance for service organizations outside the area of financial controls. Having said that, AT Section 101 actually creates value for customers when assessing their chosen service organization's capability and compliance in the areas of

  • Security
  • Availability
  • Processing Integrity
  • Confidentiality
  • Privacy

The SSAE16 resource guide provides a comprehensible explanation of AT Section 101 here.

Trust Services Principles – in addition to AT Section 101 – describe the above principles in more detail. Comprehensible one-liners on these principles can be found here.

No question, there's more. I wouldn't have talked of "myriads" otherwise; however, let's keep it to those most commonly talked about at the moment (please do drop a comment if you feel I'm missing one in this respect).

CSA CCM

The Cloud Security Alliance Cloud Control Matrix (CSA CCM) provides an addition to the aforementioned, relating to information security tailored to the cloud industry. It is becoming increasingly common to add attestation according to this standard to SOC 2 reports (see e.g. the Windows Azure Trust Center).

More on the CCM can be found here.

 

Reporting Standards

Let's KISS – keep it simple and stupid: What has recently evolved to be THE reporting standard is the (set of) Service Organization Control reports – or SOC reports. Their intention is to guide service organizations as well as certified public accountants (CPAs) through how to compliantly report on a given standard.

SOC 1

  • is used to issue reports in accordance to SSAE16
  • can lead to SSAE16 Type 1 reports (reporting on management's description of the service organization's control system and the suitability of the control design as of a point in time)
  • or SSAE16 Type 2 reports (which additionally report on the operating effectiveness of those controls over a period of time)

It has – according to SOC1 Reports and SSAE16 (at the ssae16.org webpages) – become common understanding not to speak of a SOC 1 report but rather of an SSAE16 Type 1 or SSAE16 Type 2 report.

BUT: SSAE16 Type 1 and/or Type 2 is simply not enough … because:

SOC 2

  • is the standard to report on controls relevant to security, availability, processing integrity, confidentiality or privacy.
  • is conducted in accordance to AT Section 101
  • hence extends reporting on an organization's control system from financial controls to the Trust Service Principles (see above).
  • can be issued as Type 1 or Type 2 report in the same way as SOC 1

Fair to say, therefore, that an organization NOT issuing and providing a report according to SOC 2 may not be claimed compliant with the security constraints necessary for Cloud/SaaS provisioning.

SOC 3

is an addition to SOC 2 in accordance with the Trust Service Principles (see above). The scope of any SOC 3 based assurance engagement is essentially defined by the 5 Trust Service Principles (Security, Availability, Processing Integrity, Confidentiality and Privacy) as stated further above.

SOC 3 in essence comes into play when neither SOC 1 nor SOC 2 nor additional security standards such as payment card regulations (PCI DSS) or HIPAA privacy regulations or the like are considered appropriate.

 

Finally: SarbOx

And why all that?

In 2002 – after a significant loss of trust in service organizations following well-known bankruptcies and control system breakdowns – the US Congress passed the Sarbanes-Oxley Act into law.

SarbOx – aka SOX, SOA (what an unfortunate abbreviation!) or simply "the Act" – requires management's certification of their financial results as well as management's assertion on the effectiveness of the organization's control system. That said, it somehow forms the basis for all the standards that evolved in the respective area. If interested in this even more boring (yet important!) aspect of security, check out this -> http://ssae16.com/SSAE16_SOX404.html

 

So, truth is: There’s no way out

Having only walked through the high level definitions of the mentioned standards, and at the same time having understood the importance that analysts – and customers, respectively – place on service organizations successfully asserting their internal control system, I reckon that there's quite a way to go if you intend to become a trusted Cloud provider. So, actually there's good news only for those who've already started paving their security way …

 
