Presenting Fuse 1.2: Clustered Usage Statistics

Usage statistics are critical to understanding the impact data apps have on your organization. They allow you to understand and increase the value of data by gaining insight into how people interact with it. They also allow you to meet your security, regulatory, and compliance requirements.

So we’re happy to announce today that Fuse 1.2 includes usage statistics for both a single Fuse deployment and aggregate statistics across a clustered deployment. We’ve also worked on a lot of other exciting improvements like dockerizing Fuse, streamlining clustered deployments, providing an entity extraction service, and more. Here are the highlights…

Usage Statistics

When deciding on an architecture for our usage statistics, we had to take into account the need for analytical and search capabilities along with aggregating usage from a clustered environment. Fortunately, we have a platform that provides both highly interactive search and analysis, along with the capability to harvest data from any number of sources.

When you enable usage statistics, an independent Fuse instance is deployed that provides the full range of capabilities our product offers. The data collected falls into the following groups:

History Data – information about search queries
Error Data – details about errors that occurred while the instance was running
Monitoring Data – server monitoring data (CPU, hard disk, etc.) along with index and content data

Over 50 fields of data are collected, with over 20 facets available for search and analysis. Some examples are: user, query, number of facets used, search time, active connections, and index terms.
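To give a feel for what aggregating usage across a cluster can look like, here is a minimal sketch in Python. It is not Fuse's implementation or API: the record layout, the field names (user, query, search_time_ms), and the aggregate function are all assumptions made up for illustration, loosely modeled on the fields listed above.

```python
from collections import Counter
from statistics import mean


def aggregate(records_by_node):
    """Merge per-node history records and compute facet-style counts.

    `records_by_node` maps a hypothetical node name to a list of records,
    each a dict with assumed keys "user", "query" and "search_time_ms".
    """
    # Tag every record with the node it came from, then flatten.
    merged = [
        {**record, "node": node}
        for node, records in records_by_node.items()
        for record in records
    ]
    return {
        "total_queries": len(merged),
        "by_user": Counter(r["user"] for r in merged),
        "by_node": Counter(r["node"] for r in merged),
        # mean() raises on empty input, so guard the empty cluster case.
        "avg_search_time_ms": mean(r["search_time_ms"] for r in merged) if merged else 0.0,
    }


if __name__ == "__main__":
    sample = {
        "node-a": [
            {"user": "ana", "query": "sales", "search_time_ms": 12},
            {"user": "bo", "query": "inventory", "search_time_ms": 18},
        ],
        "node-b": [
            {"user": "ana", "query": "returns", "search_time_ms": 30},
        ],
    }
    print(aggregate(sample))
```

Counting by user and by node mirrors the idea of facets: each counter is one dimension you could filter or drill into.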

statistics documentation

Clustered Architecture

We’ve improved the stability and ease of deploying and scaling a clustered Fuse setup. To accomplish this, we’ve implemented a simple two-phase commit protocol leveraging Consul along with built-in transactions for LevelDB. This implementation also provides a central access point for health checks and configuration and seamlessly handles the replication of settings and query extensions.
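For readers unfamiliar with two-phase commit, the idea can be sketched in a few lines of Python. This is a simplified, in-memory illustration only: it does not use Consul or LevelDB, and the Node class and replicate function are hypothetical names, not part of Fuse.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster member holding replicated settings."""
    name: str
    healthy: bool = True
    settings: dict = field(default_factory=dict)
    staged: dict | None = None

    def prepare(self, change: dict) -> bool:
        # Phase 1: stage the change and vote. An unhealthy node votes no.
        if not self.healthy:
            return False
        self.staged = dict(change)
        return True

    def commit(self) -> None:
        # Phase 2 (success): apply the staged change.
        if self.staged is not None:
            self.settings.update(self.staged)
            self.staged = None

    def abort(self) -> None:
        # Phase 2 (failure): discard the staged change.
        self.staged = None


def replicate(nodes: list[Node], change: dict) -> bool:
    """Apply `change` on every node, or on none of them."""
    votes = [node.prepare(change) for node in nodes]
    if all(votes):
        for node in nodes:
            node.commit()
        return True
    for node in nodes:
        node.abort()
    return False


if __name__ == "__main__":
    a, b = Node("a"), Node("b")
    print(replicate([a, b], {"timeout": 30}), a.settings, b.settings)
    b.healthy = False
    print(replicate([a, b], {"timeout": 60}), a.settings, b.settings)
```

The point of the two phases is that a setting only becomes visible once every node has agreed to accept it, so a single failing node cannot leave the cluster with mixed configuration. A production protocol also has to handle coordinator failure and timeouts, which this sketch leaves out.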

Entity Extraction

Fuse is specially crafted for exceptional speed and flexibility when processing large amounts of structured data. To complement this, we’ve developed an entity extraction service that allows pattern matching of entities for extraction when structured data or metadata doesn’t exist. The service populates facets with extracted data either during ingest or against an already populated index.
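As a rough illustration of pattern-based extraction, the Python sketch below pulls entities out of free text and attaches them to a document as facet values. The patterns, the facet names, the body field, and both functions are assumptions for the example; the actual service's configuration and API are described in the documentation.

```python
import re

# Hypothetical patterns; each key names the facet to populate.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def extract_entities(text):
    """Return unique matches per facet, in order of first appearance."""
    facets = {}
    for facet, pattern in PATTERNS.items():
        # dict.fromkeys removes duplicates while keeping insertion order.
        matches = list(dict.fromkeys(pattern.findall(text)))
        if matches:
            facets[facet] = matches
    return facets


def enrich(document, text_field="body"):
    """Return a copy of the document with extracted facets added.

    Fields that already exist on the document are left untouched.
    """
    enriched = dict(document)
    for facet, values in extract_entities(document.get(text_field, "")).items():
        enriched.setdefault(facet, values)
    return enriched


if __name__ == "__main__":
    doc = {"id": 1, "body": "Contact ops@example.com from 10.0.0.5 or ops@example.com"}
    print(enrich(doc))
```

The same enrich step could run at ingest time or as a pass over documents that are already indexed, which matches the two modes described above.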

entity extraction documentation

Dockerized Deployment

Docker is awesome and it makes things simple. Fuse is awesome and it makes things simple. So it was only natural that we dockerized Fuse. Now you can download our Fuse Docker image from our Docker registry and be up and running in minutes.

Docker installation documentation

Other Changes

Schema validation – more detailed validation and responses
Manager lite – administrative interface for basic operations
FuseLink JavaScript SDK – build data-intensive front ends quickly (read more)
Consistency improvements – reviewed conventions and ensured they were consistent

Upcoming

We’re already working on our upcoming release of Fuse, and there is some really exciting news to share. Here’s what we can reveal right now:

Statistical functions – summing, averaging, distributions… our analytical power is starting to shine
Full-featured admin interface – simplify deployment and management
Community – let’s chat and make cool things
Easy access to Fuse – can’t talk about this one yet, but Fuse will be very easy to attain…

In Closing

This release fills a big gap for our customers. We’re absolutely thrilled with the dedication our team showed in getting these changes done well and fast. Our next release is going to be a bombshell, so be ready for big announcements toward the end of 2015!