Velocity for Big Data

"Big Data" is a term that has emerged in recent years to describe a new set of data management challenges resulting from data sets so large, diverse and complex that they defy conventional methods of data management and analysis. Add to this the constant flow of new and changing data, and the challenges become greater still. But with these challenges come huge payoffs for organizations that are able to exploit big data more effectively than their competitors.

Some examples of big data applications include:

  • Intelligence gathering and analysis
  • Exploitation and fusion of geospatial, temporal and sensor data
  • Marketing and sales analytics
  • Operational analysis and modeling

Vivisimo has been helping leading organizations to exploit the value of extremely large and diverse data sets since before the term “big data” was coined. Our Velocity Information Optimization Platform makes this possible by providing reliable, field-proven solutions across the full big data lifecycle, including data access, transformation, indexing, search, analysis and visualization.



The challenges of big data can be summarized in four words: Volume, Velocity, Variety and Variability. Vivisimo’s Velocity Information Optimization Platform addresses the challenges of big data on all four of these critical dimensions.

Volume

The first challenge that organizations face when dealing with big data is volume, or the sheer amount of data. The Velocity Platform has been used to index trillions of records and petabytes of data, and provides a variety of strategies to accommodate large data volumes, including:

  • Scaling “out” to large numbers of servers with easy administration on commodity hardware
  • Scaling “up” through parallelized data collection and indexing tasks, as well as index compression
  • Master-master replication and failover that ensures that any server node can take over tasks if needed

Velocity

The challenge of velocity is the need to handle the speed with which new data is created and existing data is modified. New and modified data often must be available for searching and analytics immediately upon creation. The Velocity Platform was designed to handle large amounts of new and updated data flowing into the system by providing:

  • Real-time text and metadata analysis
  • Field-level updates to existing data, including user tags
  • Real-time activity streams delivered in context to users and client applications
  • Immediate incorporation of user activity data and feedback for use by the system

Variety

Big data implementations require handling of a variety of data formats and types, including structured, semi-structured and unstructured, as well as the special demands of rich media and transactional data. The Velocity Platform is able to analyze and ingest all of these types of data and fuse them in search results and analytics to yield insights that could not otherwise be achieved. Velocity’s relevance model accommodates diverse document sizes and formats while delivering consistent results. Key Velocity features to support this requirement include:

  • "Schema-less" indexing and searching
  • Rich analytics, including clustering, conceptual search, name matching and de-duplication
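
To make the idea of "schema-less" indexing concrete, the following is a minimal, generic sketch in Python. It is not the Velocity Platform's API or implementation; the class `SchemalessIndex` and its methods are hypothetical names chosen for illustration. It assumes only that each record is a dictionary whose fields are not declared in advance.

```python
from collections import defaultdict
import re


class SchemalessIndex:
    """Toy inverted index (hypothetical, not a Velocity API)."""

    def __init__(self):
        # Maps (field, token) -> set of document ids.
        self._postings = defaultdict(set)
        self._docs = {}

    def add(self, doc_id, record):
        """Index every field of the record; no schema is declared up front."""
        self._docs[doc_id] = record
        for field, value in record.items():
            for token in re.findall(r"\w+", str(value).lower()):
                self._postings[(field, token)].add(doc_id)

    def search(self, term, field=None):
        """Return sorted ids of documents containing the term.

        If `field` is given, only that field is searched.
        """
        term = term.lower()
        if field is not None:
            return sorted(self._postings.get((field, term), set()))
        hits = set()
        for (_, token), ids in self._postings.items():
            if token == term:
                hits |= ids
        return sorted(hits)


# Two records with entirely different fields share one index.
index = SchemalessIndex()
index.add("a", {"title": "Sensor log", "reading": 42})
index.add("b", {"subject": "Quarterly sales", "body": "sensor upgrade budget"})
```

Here `index.search("sensor")` matches both records even though they have no field names in common, while `index.search("sensor", field="title")` restricts the match to the first one.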

Variability

Along with the variety of data formats, organizations that seek to exploit big data will encounter extreme variability, including diverse security models, metadata schemas, application interfaces, and hosting options. The Velocity Platform accommodates this variability through:

  • Broad connectivity to a wide range of data management systems and applications
  • Sophisticated security mapping, including cross-domain and field-level security
  • Support for virtual documents created from multiple sources or tables
  • Federated connectivity in the cloud and on premises
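
As an illustration of what a virtual document built from multiple sources might look like, here is a small Python sketch. It is an assumption-laden example rather than a description of how Velocity works: the function `build_virtual_documents`, the source names `crm` and `orders`, and the join key `customer_id` are all hypothetical.

```python
from collections import defaultdict


def build_virtual_documents(sources, key):
    """Merge rows that share the same key value into one virtual document.

    `sources` maps a source name to a list of row dicts. Fields are prefixed
    with the source name so values from different sources never collide.
    If one source has several rows for the same key, later rows overwrite
    earlier ones in this simplified sketch.
    """
    merged = defaultdict(dict)
    for source_name, rows in sources.items():
        for row in rows:
            if key not in row:
                continue  # rows without the join key cannot be merged
            doc = merged[row[key]]
            doc[key] = row[key]
            for field, value in row.items():
                if field != key:
                    doc[f"{source_name}.{field}"] = value
    return dict(merged)


# Hypothetical sources: a CRM table and an orders table.
sources = {
    "crm": [{"customer_id": 7, "name": "Acme Corp"}],
    "orders": [{"customer_id": 7, "total": 1250.0}],
}
virtual = build_virtual_documents(sources, key="customer_id")
```

The result contains a single entry for customer 7 that carries both `crm.name` and `orders.total`, so a search over the merged record can match on fields that originated in either source.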

Watch Big Data videos:

  Meeting the Challenges of Big Data (Video Podcast)