RejuvenApptor™: technical overview



Dynamic approach to application discovery

Enterprise IT environments are complex and constantly changing, with millions of inter-dependent components. Any attempt to map and analyze enterprise applications that relies on manual labor will always be inaccurate because of human error and the lag between when the data is collected and when the analysis is done. RejuvenApptor™ was designed to continuously map and analyze the topologies of business applications/services:
  1. Auto-collect the data from servers, VMs, containers, clouds.
  2. Auto-identify the business applications/services, their topological structure and dependencies, data, unused assets, and migration groups, uncovering even forgotten and misinterpreted IT knowledge (a simplified sketch of this step follows the list).
  3. Verify and augment this mapping with IT staff knowledge.
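
To make step 2 more concrete, the sketch below shows, in deliberately simplified form and not as RejuvenApptor's actual algorithms, how observed process-to-process connections can be turned into candidate application groups: build a dependency graph and take its connected components. The record format, host names, and process names are invented for illustration.

    from collections import defaultdict

    # Hypothetical connection records as a collector might report them:
    # (source_host, source_process, dest_host, dest_process). Invented data.
    connections = [
        ("web01", "nginx", "app01", "tomcat"),
        ("app01", "tomcat", "db01", "oracle_inst1"),
        ("batch01", "tws_agent", "db01", "oracle_inst1"),
        ("hr01", "iis", "hrdb01", "mssql"),
    ]

    # Build an undirected adjacency map between (host, process) nodes.
    graph = defaultdict(set)
    for src_host, src_proc, dst_host, dst_proc in connections:
        a, b = (src_host, src_proc), (dst_host, dst_proc)
        graph[a].add(b)
        graph[b].add(a)

    def connected_components(graph):
        """Group nodes that are transitively connected; each group is a
        candidate business application/service boundary."""
        seen, components = set(), []
        for start in graph:
            if start in seen:
                continue
            stack, component = [start], set()
            while stack:
                node = stack.pop()
                if node in seen:
                    continue
                seen.add(node)
                component.add(node)
                stack.extend(graph[node] - seen)
            components.append(component)
        return components

    for i, component in enumerate(connected_components(graph), 1):
        print(f"Candidate application {i}: {sorted(component)}")

Note that naive connected components would merge the two applications in the diagram below precisely because they share a database instance; this is why middleware object-level discovery (which database inside an instance an application actually uses, as in the feature table) matters for drawing accurate boundaries.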

Figure: two applications sharing a DB instance. Example of an automatically generated application diagram; all decisions about what to show and what to omit are made by RejuvenApptor™ with no manual involvement.






Distinguishing features

Feature (RejuvenApptor™ / Others)
Automated business applications mapping and logical diagramming:
  • Software (applications)-to-server mapping: Yes / Yes
  • Process-to-process or server-to-server connectivity diagramming: Yes / Yes
  • Business application architecture and logical dependency diagramming with minimal or no IT staff interviews: Yes / No
Quality application topology models (accurate enough for non-trivial automated topological analysis):
  • Middleware object-level connection discovery (e.g., which database inside an instance is used by an application): Yes / Some
  • Ultra-deep software models (e.g., modeling job schedulers at the level of individual job dependencies): Yes / No
  • Deep software models (e.g., modeling individual databases and their tablespaces): Yes / Some
  • Shallow software models that identify software installations, their version, vendor, and location: Yes / Some
  • Generic software models (to identify custom and rare software out of the box): Yes / No
  • 100% active software detection in a short time frame (including custom software): Yes / No
  • Classification of every dependency by type (e.g., IT infrastructure, NAS, etc.): Yes / No
Automated application dependency analysis algorithms for:
  • Identification of unused IT assets (a simplified sketch follows this table): Yes / No
  • Migration planning: Yes / No
  • Client data classification (for security and BI): Yes / No
  • Reliability and availability analysis: Yes / No
  • Enterprise software licensing analysis: Yes / No
Services-friendly, fast, and complete deployment that relies on the existing client tools and access policies: Yes / No
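
As one hedged illustration of the "identification of unused IT assets" row above: once dependencies have been observed over a sufficiently long window, assets that never appear in any dependency become candidates for decommissioning review. The inventory and connection data below are invented, and the real algorithms are considerably more involved.

    # Hypothetical observed dependencies (source host, destination host) and a
    # hypothetical server inventory; both invented for illustration.
    observed_connections = [
        ("web01", "app01"),
        ("app01", "db01"),
    ]
    inventory = {"web01", "app01", "db01", "legacy07", "test03"}

    # Every host that initiates or receives at least one observed dependency
    # counts as active.
    active = set()
    for src, dst in observed_connections:
        active.update((src, dst))

    # Hosts that never appear in any dependency are candidates for review;
    # per step 3 of the discovery approach, IT staff still verify the result.
    unused_candidates = inventory - active
    print(sorted(unused_candidates))  # ['legacy07', 'test03']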



Fast data collection process

Fast rollout: our record is 7 days from the first client call to report delivery for a datacenter with hundreds of servers.


100% penetration: for most clients we cover even the servers that are off the domain, behind firewalls, or otherwise hard to reach.


Our data collection flexibility results in the following advantages:
  • No network traffic is generated during most data collections.
  • No new accounts are required.
  • No new firewall exceptions are required.
  • No middleware-specific accounts are required in most cases.
  • No network scanning is required.



Supported platforms


Servers:
  • Windows 2000 and later, Hyper-V, and the same generations of Windows desktop OSs.
  • Linux (including zLinux): all major distributions.
  • ESX/ESXi v3 and above (32- and 64-bit).
  • AIX v4 and above (POWER and Intel).
  • HP-UX v10 and above (PA-RISC and Intel).
  • Solaris v8 and above (SPARC and Intel).
  • FreeBSD v4 and above.
  • Tru64 (OSF1).
  • OS X v10 and above.
  • OpenVMS.



Notable OS discovery features:
  • OS support includes:
    OS-specific file systems, mounted and shared file systems, Logical Volume Managers, HBA/SAN/NAS/iSCSI interfaces and their mapping to disks, devices,
    CPU information (model, vendor, cores, threads, speed, virtual CPU pooling and pinning),
    memory, swap,
    processes, installed packages, bugfixes,
    users, groups, AD roles, etc.
  • Correct discovery of CPU details on old systems (many CPUs report incorrect information about their cores/threads, which most discovery tools happily report in turn).
  • Utilization capturing over a period of time:
    CPU (including I/O wait time on some OSs)
    Memory and swap
    I/O (interfaces and disks, volume and IOPs)
  • Long-term connections monitoring: it sounds trivial, but most tools (especially agent-less ones) capture network connections at only one point in time.
    The diagram below shows the typical percentage of application topological dependencies observed as a function of the monitoring time; a simplified aggregation sketch follows this list.

    Application topological dependencies observed over time.
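
The aggregation idea behind long-term monitoring can be illustrated with a small sketch (the timestamps, hosts, and snapshot format are invented and are not the product's internal representation): each individual snapshot misses dependencies that happen to be idle at capture time, while the union over the monitoring window keeps growing toward the full set.

    from datetime import datetime

    # Hypothetical connection snapshots: (capture time, set of (source, destination)
    # pairs visible at that instant). Invented data for illustration.
    snapshots = [
        (datetime(2024, 1, 1, 9),  {("web01", "app01")}),
        (datetime(2024, 1, 1, 21), {("web01", "app01"), ("app01", "db01")}),
        (datetime(2024, 1, 5, 3),  {("batch01", "db01")}),  # weekly batch window
    ]

    # Accumulate dependencies over time; a single snapshot would have seen
    # only a fraction of them.
    seen = set()
    for captured_at, connections in snapshots:
        newly_observed = connections - seen
        seen |= connections
        print(f"{captured_at:%Y-%m-%d %H:%M}: {len(seen)} dependencies known "
              f"({len(newly_observed)} new)")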


SAN, NAS, Networking devices:


There is experimental support for deep configuration and topology discovery on most popular NAS, SAN, and networking devices.




Software discovery

Ultra-deep models


Job schedulers: IBM TWS, CA Autosys, Microsoft System Center Orchestrator
Job scheduling systems orchestrate applications on hundreds or thousands of servers. Knowing that a server has a job scheduling agent running and is communicating with a centralized scheduling server provides no insight into the application logic. We model job schedulers at the level of job dependencies across the different servers, so we can see job-driven logical application dependencies (a simplified sketch follows).
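
A minimal sketch of the idea (the job names, hosts, and definition format are invented; real TWS/Autosys/Orchestrator exports are far richer): job-level dependencies are translated into logical host-to-host application dependencies that would never appear as direct network connections between those hosts.

    # Hypothetical, simplified job definitions: each job runs on a host and may
    # depend on other jobs completing first.
    jobs = {
        "extract_orders": {"host": "erp01",  "depends_on": []},
        "load_warehouse": {"host": "dwh01",  "depends_on": ["extract_orders"]},
        "send_report":    {"host": "mail01", "depends_on": ["load_warehouse"]},
    }

    # Translate job-level dependencies into logical host-to-host edges.
    host_edges = set()
    for name, job in jobs.items():
        for upstream in job["depends_on"]:
            src, dst = jobs[upstream]["host"], job["host"]
            if src != dst:
                host_edges.add((src, dst))

    print(sorted(host_edges))
    # [('dwh01', 'mail01'), ('erp01', 'dwh01')]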


Service Managers: Microsoft SCSM, Cherwell ITSM
Service Managers contain valuable information about IT operations. We recognize work items and service catalogs to map operational information to applications.


SAP: SAP deployments are like hidden datacenters
We automatically analyze SAP configurations down to the logical dependencies of custom SAP programs and can run our automated analysis algorithms on them (e.g., to identify SAP programs and their dependencies, find unused SAP components, partition the configuration for migrations, etc.).

Figure: auto-generated SAP application diagram. The example illustrates a SAP ABAP code fragment performing a business function: a task that runs periodically, fetches data from a remote server via a web URL, and uses a local file for logging. SAP deployments contain hundreds of such custom-created programs, which we auto-identify.
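
As a toy illustration only (this is not how RejuvenApptor parses SAP; the ABAP fragment and patterns below are invented), the sketch shows the kind of dependency-relevant facts that can be pulled out of a custom ABAP program: web URLs it calls, local files it writes, and RFC destinations it talks to.

    import re

    # Invented ABAP fragment resembling the diagrammed example: a periodic task
    # that fetches data over HTTP, logs to a local file, and calls a remote RFC.
    abap_source = """
      cl_http_client=>create_by_url( EXPORTING url = 'https://partner.example.com/feed'
                                     IMPORTING client = lo_client ).
      OPEN DATASET '/var/log/zfeed.log' FOR APPENDING IN TEXT MODE ENCODING DEFAULT.
      CALL FUNCTION 'Z_UPDATE_ORDERS' DESTINATION 'ERP_CENTRAL'.
    """

    # Deliberately rough patterns for three kinds of external dependencies a
    # custom program can introduce.
    patterns = {
        "web_dependency":  r"url\s*=\s*'([^']+)'",
        "file_dependency": r"OPEN DATASET\s+'([^']+)'",
        "rfc_dependency":  r"DESTINATION\s+'([^']+)'",
    }

    for kind, pattern in patterns.items():
        for target in re.findall(pattern, abap_source, flags=re.IGNORECASE):
            print(f"{kind}: {target}")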

Deep models
Deep models provide internal software details like tablespaces for databases, deployed program modules and URLs for application servers, queues for messaging middleware, etc.

  • Virtualization systems
  • Clusters
  • Databases
  • Web and application servers
  • Messaging middleware
  • Email servers
  • Local and network file systems
  • Local and network block devices
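
For the database category above, a deep model goes beyond detecting the installation and captures objects inside the instance. The sketch below is one hedged example using PostgreSQL and psycopg2 (both chosen purely for illustration; connection details are placeholders and the product's own collection works differently).

    import psycopg2  # illustrative choice; any read-only database driver would do

    # Placeholder connection details; a real deployment would reuse existing
    # read-only credentials per the data collection approach described above.
    conn = psycopg2.connect(host="db01.example.com", dbname="postgres",
                            user="readonly", password="...")
    with conn, conn.cursor() as cur:
        # Databases hosted inside this one instance.
        cur.execute("SELECT datname FROM pg_database WHERE NOT datistemplate;")
        databases = [row[0] for row in cur.fetchall()]

        # Tablespaces and where they live on disk.
        cur.execute("SELECT spcname, pg_tablespace_location(oid) FROM pg_tablespace;")
        tablespaces = cur.fetchall()

    print("Databases:", databases)
    print("Tablespaces:", tablespaces)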


Shallow models


Other common software (applications and middleware): shallow models extract the software name and location, and commonly also the version and vendor.


Generic models


Custom and rare software: given the diversity of software systems, no amount of signatures and models will ever cover them all. Unlike other application mapping vendors, we do not ignore unrecognized software: generic models discover generic information about software that is running even when no specialized model is available out of the box.

Moreover, we have developed technologies that let us semi-automatically identify and add models of rare and custom components within a few minutes per software type. As a result, we can quickly approach a 100% software identification rate for each client with a minimal amount of work.
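
The flavor of a generic model can be sketched as follows (a toy example, not the product's implementation; psutil and the tiny set of "known" signatures are assumptions made for the illustration): when no specialized signature matches a running process, record enough generic facts about it, such as binary path, listening ports, and open files, to still place it in the topology.

    import psutil  # assumption: psutil is available for this illustration

    # A toy "generic model": capture generic facts about a process for which
    # no specialized software signature exists.
    def generic_fingerprint(proc):
        with proc.oneshot():
            return {
                "name": proc.name(),
                "exe": proc.exe(),
                "cwd": proc.cwd(),
                "listening_ports": sorted(
                    c.laddr.port for c in proc.connections(kind="inet")
                    if c.status == psutil.CONN_LISTEN
                ),
                "open_files": [f.path for f in proc.open_files()][:10],
            }

    # Fingerprint every running process whose name matches no known signature.
    known_signatures = {"oracle", "java", "nginx", "sqlservr"}  # illustrative set
    for proc in psutil.process_iter(["name"]):
        try:
            if proc.info["name"] not in known_signatures:
                print(generic_fingerprint(proc))
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            continue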


Figure: software distribution. Typical distribution of the total number of servers that host each unique software product; most software systems are used on only one or two servers.





Frequently asked questions

Does the data collection impact production workloads on the servers?
The data collection does not involve any memory-heavy operations such as running a Java VM; only basic shell commands are involved. The process is started with low priority, so any production activity on the servers gets CPU time first and is not impacted. On an idle system the data collection may cause noticeable CPU overhead during the first seconds after it starts, which matters little precisely because the system is idle.
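
For illustration only (the script name and launcher below are hypothetical, not RejuvenApptor artifacts), this is one way a collection script can be started at the lowest CPU priority so that production work always wins the scheduler:

    import os
    import subprocess

    def run_collection_low_priority(script="./collect.sh"):  # placeholder name
        """Launch the collection script niced to the lowest CPU priority (POSIX)."""
        return subprocess.run(
            [script],
            preexec_fn=lambda: os.nice(19),  # child drops to the lowest priority
            check=True,
        )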

Can you reuse the data we already have in existing tools and spreadsheets?
We can typically use the available data from existing tools and spreadsheets. However, the data modeling and collection quality of existing tools is typically not sufficient to run automated graph-analysis algorithms and produce useful results; such data requires extensive manual labor and manual analysis to be useful. So we typically have to run our own data collection as well.

Yes.