34 Frequently Asked Questions
about Red Lambda, Inc.
Red Lambda enables businesses and government agencies to effectively secure their
data through advanced, Big Data analytics technologies that break through the
barriers and limitations of existing legacy systems and appliance-based offerings. Red
Lambda’s seamlessly integrated suite of solutions, powered by its massively scalable
distributed grid platform called MetaGrid™, fuses virtual supercomputing,
relational stream processing and artificial intelligence for the first time into one
complete system, enabling real time, on-the-fly anomaly detection for known and
unknown threats.
The system’s predictive capabilities deliver unprecedented visibility and actionable
intelligence that makes sense of structured and unstructured data without rules,
signatures or manual programming. By empowering end users, companies can
deploy preemptive strategies to confidently defend against cyber attacks, while
deriving significant business value from their operational data.
1. Is security and operations where Red Lambda is starting?
Yes. The core technology of the grid platform and the analytics engine, which is called
Neural Foam™, uses generalized algorithms that can be applied to any broad scale
computing, business intelligence, or data mining task. They can be used in a traditional
way, such as analyzing a customer database, or applied to more forward-looking
applications involving many disparate data sources—for example, analyzing social media
and incorporating that into trend analysis. We focused on security and network
operations out of the gate because our team has such deep talent there. Our customers
immediately saw exciting potential in other areas, and we explored them.
2. What is Neural Foam?
Neural Foam is our artificial intelligence engine, which powers analytics in MetaGrid.
Neural Foam is a universal, automatic data-mining engine, meaning it can be applied to
any kind of data. Neural Foam discovers meaningful patterns, anomalies, and
correlations, without prior knowledge or training period. It operates automatically, so you
really just have a quality and performance trade-off in terms of tuning the software.
© 2014 Red Lambda, Inc. All Rights Reserved.
3. What kind of expertise is required to operate? Do you need to
be a data scientist?
Neural Foam democratizes data mining and requires no data mining experience. Our
mission is for customers to get productive, actionable results immediately, not focus on
getting a PhD in computer science.
4. Does MetaGrid only look at current or recent events, or are
you looking at things that may have happened over a long
period of time, say, over a number of years?
MetaGrid does both, analyzing events and time series simultaneously. It could be used to
analyze a year’s worth of pricing data to determine buying habits or to look for advanced
persistent threats (APTs) where people have been methodically infiltrating an
organization over a long period of time. In fact, most Fortune 1000 companies are
targeting APTs explicitly. Simultaneously, the business intelligence implications are
enormous because MetaGrid makes no assumptions about what time periods might be
important. It just finds them.
5. How long is the timeline of a real advanced persistent
threat?
Potentially years. If there is a rule to advanced persistent threats, it’s that the timeline is
probably a lot longer than you think. Attackers know that the ability to find an indicator
of compromise in years of logs or traffic is something that no existing vendor can address.
Our software is built from the ground up for those kinds of extraordinary scale situations,
so this problem is a perfect fit for Red Lambda.
6. Can you take data you have never seen before and
immediately start using it?
Absolutely. We had an opportunity to brief an important analyst. Anyone who has ever
been to an analyst briefing at a conference knows there’s a bit of a formula: you have five
minutes to talk, ten minutes for questions, and then they’re on to the next vendor.
Instead, the analyst spent an extra hour-and-a-half with us and gave us data we had
never seen from his own research. MetaGrid analyzed that data and he immediately got
results that really blew him away. It was fascinating because he found things he had never
seen. He mentioned that usually when you think of neural networks, you think of armies
of academics trying to make sense of your data, whereas MetaGrid just does it
automatically.
7. Red Lambda “takes scale, speed and storage off the table”.
What do you mean by that and how do other vendors approach
the problem?
It means computing as fast as you need to at any size by using our grid computing breakthroughs.
If you think about how other vendors approach problems, it starts with how
much horsepower they have in the devices that they are working with, what algorithms
can be applied based on that horsepower, and how much data can really be analyzed.
These are all scaling questions. Instead, we focus on the best method for solving the
problem because we know we can solve the scaling issue. That’s a huge competitive
advantage. While other vendors force you to replace your infrastructure periodically, ours
just gets more powerful organically as shared infrastructure is upgraded.
8. How does this idea of unlimited scale apply to security?
The goal of the security industry must be to analyze anything, anywhere, any time if a new
generation of security is to begin. Only MetaGrid can do that. Currently, the goal of many
typical security solutions is to gather and filter data as rapidly as possible, throwing out
things “someone” thinks aren’t important. Unfortunately, signatures created in advance
drive most security products, such as anti-virus, intrusion prevention, firewalls, and
SIEMs. There is no point in analyzing data that doesn’t have a signature because it creates
a ton of performance challenges when you are limited to the speed of single devices.
Attackers have exploited this dangerous assumption for years to simply hide in plain
sight.
9. Can you explain that? How can the system detect things like
zero-day events if it’s not using rules or signatures?
Neural Foam makes this possible by incrementally discovering all patterns, anomalies,
and correlations and automatically measuring similarity to new things. Take a
polymorphic or self-modifying virus as an example. Anti-virus vendors create an explicit
signature for every version of a virus and have heuristics that can sometimes discover
unseen variants if the variation happens in a known way. But Neural Foam finds all
variants, regardless of how the virus modifies itself. If the functionality is similar, Neural Foam
not only finds the new variant, but also knows how similar it is automatically without any
heuristics or assumptions. We believe assumptions are the root of most security evils.
10. Once you identify the anomaly using Neural Foam, then
how does the system take action on that anomaly across the
enterprise?
Since MetaGrid is able to interact with the infrastructure directly, there is an incredible
opportunity to automate policy after detection. We’re able to use the grid to automate
much of the process life cycle, such as quarantine, mitigation, and notification. Taken
together, you have a scale-free computing engine that can analyze any data and
automate response. This is a very potent platform, not just for security and operations
but also for broader scale business intelligence and automation.
11. How does MetaGrid actually do all this?
Everything begins with the MetaGrid computing platform. It was built from the ground up
to support global-scale computing. It is a dynamic, event-driven stream processing
system, which means everything in the system is computed continuously as it operates.
The grid platform gives you all the power and control found on a single system even
though it can use every computer on the planet as one. It gives you centralized memory
and a central file system. It looks and feels and acts, as far as the application knows, like a
single computer. Under the hood, it is dynamically load balancing event-by-event,
moving processing adaptively around the grid. Essentially, computation lives on the grid
as a mobile process. There is only a single piece of software and we have used this
successfully on grids of systems from smartphones to supercomputers.
12. The system was inspired by peer-to-peer (P2P) file-sharing
systems. Why was it modeled after this type of system?
Yes, the system was inspired by peer-to-peer file sharing. We looked at P2P systems and
said, “Those are pretty impressive. There are tens of millions of people using them at any
given time. They are virtually impossible to take down, but with all those qualities, all they
really bring to the table is the ability to store and search for files. Wouldn’t it be amazing
to turn that into a general-purpose computer, one capable of supporting a global computing
ecosystem?” That’s exactly what we did with our grid technology.
13. What type of computing model do you use? Does it replace
things like MapReduce or MPI?
MetaGrid computes over graphs of services. It is Turing complete and can do anything
that can be done on a single computer, cluster, or supercomputer. MapReduce, MPI, and
other parallel computing metaphors are merely a subset of its functionality. Anything you
can do in those architectures you can do on MetaGrid.
14. Does a user have to re-architect their code to benefit from
the scalability of MetaGrid?
No. In fact, MetaGrid is designed to scale code that has not specifically been written for
scale. It can take applications, APIs, or even scripts not built for massively parallel
processing and give them the necessary scale without having to re-architect the entire
underlying code base. This is what we call “throughput supercomputing.”
15. How does the system handle all the disparate datasets
throughout an organization?
We know that companies don’t just have one giant monolithic data set. Instead, they
have many data sets in a variety of forms and locations; lots of data, little and big, is
created and stored throughout the enterprise. Bringing all that data together into a
common platform and computing on it in a seamless fashion is what makes MetaGrid
such a powerful solution. MetaGrid has services that ingest or retrieve virtually any
dataset, index it for search, analyze it using Neural Foam, store it, and take action in
response. MetaGrid brings a simple, common workflow to all datasets, large or small.
16. How did you make a large, distributed system manageable?
MetaGrid has a “self-everything” architecture. It’s self-organizing, self-optimizing, and
self-healing, event-by-event. The goal from day one was to make large distributed
systems easy to manage. Currently, it’s very difficult to manage physically distributed
systems. Even simple changes can be very complicated, spanning long periods of time,
perhaps even weeks. We tackled these issues directly at the architectural level, making
computing more resilient so that the tasks of management are actually practical for the
kind of extraordinary scale that we are discussing.
17. Would all of the security devices in an enterprise feed their
data into MetaGrid?
Yes, data from everything, from the network infrastructure to things like firewalls,
intrusion systems, and application servers. MetaGrid eventually becomes a centralized
collection of analytics data and automation within the environment. This is critical to
the needs of security because situational awareness requires visibility at all levels and
coordinated response. There can be no dark corners. The needs of security overlap with
the needs of business intelligence since proactive and actionable intelligence is the key to
a firm’s competitive advantage.
18. Is MetaGrid a Software as a Service (SaaS) that you provide
or do clients purchase it for site installation?
MetaGrid is for building private grids, so this is a complete architecture at the customer’s
site, using the customer’s infrastructure. There is no global MetaGrid that this attaches
to, for example, that provides community capacity. Customers don’t want to ship their
security data offsite, and oftentimes, the data is far too fast moving to move offsite in any
case. The goal is to use systems in place in the environment in a geographically
distributed way and create a compute grid out of those resources to maximize the value
of the data and the hardware investments.
19. Can you explain the underlying architecture of MetaGrid?
The interesting thing about MetaGrid is that it’s not a number of different open-sourced
projects cobbled together. MetaGrid is a complete platform that includes compute, file
system, relational storage, event storage, indexing, and analytics, built from the ground
up in a highly modular architecture for real-time processing. MetaGrid pre-dates
Hadoop and was built to tackle the much broader problem of ubiquitous computing.
We’ve advanced far beyond what those open-source solutions are capable of doing and
as a result, have quite a broad suite of intellectual property.
20. Why isn’t MetaGrid an open source technology?
MetaGrid always embraces open source solutions when appropriate. Over the years, we
tried many open source projects for underlying components and some survive to this day.
Frequently, however, we have quickly out-scaled those projects. In fact, the most recent
example is the Apache Lucene search engine we replaced. We found very quickly that
Lucene was just inadequate given the data rates our customers experience in real-time.
We removed Lucene and replaced it with an in-house system that is three times faster
and results in three times less storage. This example speaks to the nature and complexity
of real-time processing: you need a different architecture. It’s not enough to just bandage,
patch, or port an existing architecture because a lot of the assumptions that have been
made in that data or in those particular methodologies just don’t work. It’s like trying to
turn an airplane into the Space Shuttle.
21. Can third-party applications be used with the grid?
Yes, data from everything, from the network infrastructure to things like firewalls,
intrusion systems, and application servers. MetaGrid eventually becomes a centralized
collection of analytics data and automation within the environment. This is critical to
the needs of security because situational awareness requires visibility at all levels and
coordinated response. There can be no dark corners. The needs of security overlap with
the needs of business intelligence since proactive and actionable intelligence is the key to
a firm’s competitive advantage.
22. How does MetaGrid ensure resources are fairly shared?
MetaGrid was built specifically for many users to share the grid’s power at the same time.
To this end, MetaGrid automatically prioritizes, optimizes, and load balances all
processing. It doesn’t make sense to consolidate so many resources only to dedicate them to
single problems. For comparison, on Hadoop, if you have two different people running
jobs at the same time and they run over each other, there is no fair load-balancing or
sharing of resources. It’s a first-come, first-served model.
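As a rough illustration of the difference between first-come, first-served scheduling and fair sharing, the sketch below interleaves work from several users round-robin. It is not MetaGrid's scheduler, whose internals are not described here; `fair_schedule`, the user names, and the task names are all hypothetical.

```python
from collections import deque
from typing import Dict, List


def fair_schedule(jobs_by_user: Dict[str, List[str]]) -> List[str]:
    """Return a run order that takes one task per user per round."""
    # Skip users with no tasks so every queue below is non-empty.
    queues = {user: deque(tasks) for user, tasks in jobs_by_user.items() if tasks}
    order: List[str] = []
    while queues:
        # Visit users in a stable order; each gets one turn per round.
        for user in list(queues):
            order.append(queues[user].popleft())
            if not queues[user]:
                del queues[user]  # this user has no work left
    return order


if __name__ == "__main__":
    submitted = {"alice": ["a1", "a2", "a3"], "bob": ["b1"]}
    print(fair_schedule(submitted))
```

Under a first-come, first-served policy, bob's single task would wait behind all of alice's work; with round-robin turns it runs second.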
23. Do all the services on the grid platform need to be written
in Java?
No. MetaGrid uses a service-oriented model, so services can be written in virtually any
language. You can have things written in scripting languages, Java, C, C++, Lisp, etc., and
they can all participate as part of a common workflow or what we call “jobs.” The grid
orchestrates services, making their language invisible to the application. This solution
enables tremendous compatibility and flexibility.
24. Can MetaGrid do full text indexing and storage?
Yes, MetaGrid can index all data to support search. Furthermore, it’s data agnostic, unlike
most other products. Many popular search tools assume that everything is
space-delimited English text, so they’re not consuming Chinese language, for example, or
Arabic. We’re designed to consume and analyze any form of data.
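One common way to index text without assuming space-delimited words is to index overlapping character n-grams. The sketch below shows that general technique only; it is not a description of MetaGrid's indexer, and the `NGramIndex` class and sample documents are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


class NGramIndex:
    """Substring search index built on overlapping character n-grams."""

    def __init__(self, n: int = 2) -> None:
        self.n = n
        self.docs: Dict[int, str] = {}
        self.postings: Dict[str, Set[int]] = defaultdict(set)

    def _grams(self, text: str) -> List[str]:
        return [text[i:i + self.n] for i in range(len(text) - self.n + 1)]

    def add(self, doc_id: int, text: str) -> None:
        self.docs[doc_id] = text
        for gram in self._grams(text):
            self.postings[gram].add(doc_id)

    def search(self, query: str) -> List[int]:
        grams = self._grams(query)
        if not grams:
            # Query shorter than n: fall back to scanning every document.
            candidates = set(self.docs)
        else:
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        # Confirm each candidate, since shared n-grams do not prove a match.
        return sorted(d for d in candidates if query in self.docs[d])


if __name__ == "__main__":
    index = NGramIndex()
    index.add(1, "登录失败: 用户 admin")
    index.add(2, "login failed: user admin")
    print(index.search("登录失败"), index.search("admin"))
```

Because the index never splits on whitespace, the same code path handles Chinese, Arabic, and English text.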
25. How is the MetaGrid event-processing model different from
traditional event-processing?
The traditional event-processing model, which gathers data, stores it in a database, and then
performs batch analytics or triggers complex queries, is in use by virtually all big data
vendors. Most of them consider this model to be real time, even though the data has
already come to rest or is processed in batches. From our perspective this approach is
little more than a forensic activity. As soon as the data comes to rest, it’s stale and is no
longer real time. The reason vendors do this is simple: it’s easier for them to rely on hard
disks as a crutch when they don’t have enough computing power, at the expense of
latency and actionable results.
In contrast, while MetaGrid can perform such forensic analysis, it is also a true real-time
processing system for data at global scale. Every component, from Neural Foam to
indexing to the core processing architecture, has been designed to work on data before it
ever comes to rest. Results are found in-stream and flow directly to visualization or trigger
automation. Storage is for storage, not to compensate for too much data to fit on a single
computer.
26. How does MetaGrid deal with stream processing data across
geographical correlations, and do that in real time?
First, I want to clarify the concept of “stream.” For us, a stream is just a way to logically
organize your events. For example, at a large global infrastructure provider, they have one
stream that is called “logs,” which has all of the log data from 25 different vendor
products (approximately 30,000 devices). Streams are not focused on the specific data
source or location. If I wanted to analyze all the data in my log stream to look for
correlations, I wouldn’t have to break the streams out into 25 separate streams, one for
each vendor. I can analyze all the data collectively as a single stream of events even if the
data isn’t from the same source.
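To make the idea of a stream as a logical grouping concrete, the sketch below merges events from two sources into one time-ordered sequence that can be analyzed as a whole. It assumes each source is already sorted by timestamp; the `Event` fields, the `logical_stream` helper, and the sample data are hypothetical and do not reflect MetaGrid's actual API.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: float  # seconds since some epoch
    source: str       # hypothetical vendor or device label
    message: str


def logical_stream(*sources: Iterable[Event]) -> Iterator[Event]:
    """Merge time-ordered sources into one time-ordered stream of events."""
    return heapq.merge(*sources, key=lambda e: e.timestamp)


if __name__ == "__main__":
    firewall = [Event(1.0, "firewall", "deny tcp/445"),
                Event(4.0, "firewall", "deny tcp/23")]
    proxy = [Event(2.0, "proxy", "GET /login"),
             Event(3.0, "proxy", "POST /login")]
    for event in logical_stream(firewall, proxy):
        print(event.timestamp, event.source, event.message)
```

Downstream analysis then sees a single sequence of events and never needs to know how many vendors contributed to it.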
MetaGrid’s analytics are fully distributed and incremental. There is obviously some magic
under the hood for things like join propagation and query routing for traditional complex
event processing, but when you’re talking about incremental analytics, Neural Foam
makes this possible. Neural Foam is built for incremental analytics where you need to
perform data mining event by event. Using Neural Foam, you can actually throw the raw
data away directly after it has passed through the analytics engine and still retain all the
intelligence needed for correlation without storing it.
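The general idea of incremental analytics, where each event updates a compact summary and the raw event can then be discarded, can be shown with a far simpler stand-in than Neural Foam. The sketch below uses Welford's online algorithm for mean and variance. It is a statistical example chosen only because it is small and well known; it is not how Neural Foam works, and the `RunningStats` class is hypothetical.

```python
class RunningStats:
    """Welford's online algorithm: mean and variance without storing samples."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one observation into the summary; the value is not retained."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of everything seen so far."""
        return self._m2 / self.count if self.count else 0.0


if __name__ == "__main__":
    stats = RunningStats()
    for latency_ms in [2, 4, 4, 4, 5, 5, 7, 9]:
        stats.update(latency_ms)
    print(stats.count, stats.mean, stats.variance)
```

The summary stays at three numbers no matter how many events pass through, which is the property the answer above describes.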
27. In many instances, your customers would already have a
SIEM and other security tools as data sources. Can Red Lambda
use these? Do you need these devices to get to the raw data?
We can certainly use SIEMs as a data source, since they provide rich classifications for events. We
can feed off of that information and make use of the data, as more metadata makes the
foam smarter. That said, a lot of customers have used SIEMs and they are painfully aware
of their scaling limitations, so it doesn’t make sense to have SIEMs bottleneck MetaGrid
by sitting in front of events. Instead, it’s better to run them beside MetaGrid so they do not
limit performance but still provide benefit.
Think of MetaGrid as Pac-Man. We want to gobble up as much context and data as
possible from as many places as we can, and we will take that data however we can get
it. When we can consume all of the data, we can draw deep correlations about what is
happening.
28. What is the foundation of Neural Foam?
Neural Foam is based on fundamental breakthroughs in operationalizing artificial
intelligence and algorithmic information theory. It is based on some spectacular
theoretical math that no one knew how to use practically. One of our interesting
breakthroughs with Neural Foam was applying that theoretically perfect solution in
practice to real world data. Because it is based on compression, it is effective on any form
of binary data, and I really do mean any data.
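A publicly known example of compression-based similarity from the algorithmic information theory literature is the normalized compression distance (NCD), which approximates the uncomputable Kolmogorov complexity with an ordinary compressor. The sketch below uses zlib for that purpose. Whether Neural Foam uses NCD or anything like it is not stated here, so treat this purely as an illustration of why a compression-based measure can work on arbitrary bytes; the function names and sample data are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed form, used as a rough complexity estimate."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: lower for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    log_a = b"\n".join(b"GET /index.html HTTP/1.1 200 user=%d" % i for i in range(50))
    log_b = log_a + b"\nGET /admin.php HTTP/1.1 403 user=7"  # near-duplicate of log_a
    rng = random.Random(0)
    noise = bytes(rng.randrange(256) for _ in range(len(log_a)))  # unrelated bytes
    print("similar:  ", round(ncd(log_a, log_b), 3))
    print("unrelated:", round(ncd(log_a, noise), 3))
```

The measure needs no parser, schema, or feature engineering: any two byte strings can be compared, at the cost of being only as good as the compressor behind it.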
29. What are the key areas of functionality in MetaGrid?
There are four key pillars of the MetaGrid analytics and correlation suite.
First is clustering, which is essentially finding the haystacks of data, the natural groupings
inside it. The task of finding a family tree, or phylogeny, in DNA is the same task. When
Neural Foam first absorbs data, it tries to find those haystacks. We are guaranteed to find
every anomaly and we’re guaranteed to find every cluster and every pattern.
Second, it finds the correlations, which are the relationships across the data sets, and
those can come in many different forms. They can come from the context of the
information, the time window, or pattern matches. But the real key here is that, again,
because of some of the breakthroughs in math, we’re able to guarantee the discovery
of all correlations in that data when you use maximum resolution—which means that if
there is something that is contained in the data, Neural Foam is guaranteed to discover it.
Third is classification, which is understanding unknown events based on what we already
know. As operators use MetaGrid, it learns from their classifications and prioritization of
events. In turn, Neural Foam uses this to suggest what new, unseen events are.
Classification aids dramatically in operations by taking advantage of community
knowledge to better understand what’s happening.
Fourth is anomaly detection, which is discovering the needles in the haystack. Rather
than basing this on statistical assumptions, MetaGrid analyzes every subpattern of every
length, over every window to completely resolve all anomalous patterns. Not only are the
results intuitive, in that most things in an environment are not anomalous at all, but they
also reveal subtle anomalies down to a single bit. By not being based on statistical
distribution assumptions, this solution delivers the best results instead of shooting in the
dark.
30. Does Neural Foam do zero-day detection of emerging
threats?
Yes. Neural Foam discovers all new patterns, trends, or specific exploits that differ by as
little as a single bit. The most interesting thing about Neural Foam is that it can find the
indicators of compromise in any of the data available – logs, traffic, network info, etc.
Neural Foam can find things when no other tool can because it makes no assumptions
about what might be important.
31. How does the grid figure out which services to send events
to?
If one of the kernels on the grid gets busy, it automatically load balances its work to other
kernels on the grid. The load balancing is completely dynamic. It happens event by event
and that keeps things very simple and survivable.
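A minimal sketch of per-event load balancing follows, assuming the simplest possible policy: each incoming event goes to whichever kernel currently has the least queued work. The real MetaGrid policy is not described here, so the `Kernel` class, the `dispatch` function, and the queue-length metric are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Kernel:
    name: str
    pending: List[str] = field(default_factory=list)  # events queued on this kernel


def dispatch(event: str, kernels: List[Kernel]) -> Kernel:
    """Send one event to whichever kernel currently has the least queued work."""
    target = min(kernels, key=lambda k: len(k.pending))
    target.pending.append(event)
    return target


if __name__ == "__main__":
    # k1 starts out busy; k2 and k3 are idle.
    grid = [Kernel("k1", ["e0", "e1", "e2"]), Kernel("k2"), Kernel("k3")]
    for i in range(6):
        chosen = dispatch(f"event-{i}", grid)
        print(f"event-{i} -> {chosen.name}")
    print({k.name: len(k.pending) for k in grid})
```

Because the decision is made again for every event, there is no global plan to recompute when a kernel becomes busy or disappears; new work simply flows to the others.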
32. How do you manage the grid or visualize results? Is there a
GUI?
MetaGrid has a console UI that enables any number of people to use the grid. Results from
analysis stream directly into the UI, rather than passing through a central server, making
the UI real-time just like the grid. Numerous charting and reporting features are built into
the UI, covering traditional visualization to advanced visualizations in support of Neural
Foam. Additionally, all management of the grid is done via the UI, making MetaGrid far
simpler to install and maintain than any distributed computing solution.
33. How can MetaGrid fit into a company’s existing
organizational structures?
MetaGrid melds smoothly into existing environments by supporting the real purpose
behind data mining: obtaining actionable results. At one of our customer sites, they have a
team of data scientists. To capture one of the gentlemen’s responses, he said, “You know,
this is really convenient. You’ve solved one of the problems I’ve been trying to solve for the
last 10 years. Now I can get back to using results instead of finding them.” We get this a lot.
People are tired of academic analytical tools; they’re tired of research projects.
34. It sounds like you have security pretty well covered. Can the
same data gathered for security be leveraged for business
intelligence purposes?
Yes. We believe the same method has very broad BI implications as well. We actually find
some interesting situations where security is piquing the interest of marketing, for example,
because they’re seeing trends hours before marketing does with their Google Analytics
account. Frankly, the results have astonished us when applying MetaGrid to other things
besides security.
When people used to use data mining tools, it required a large amount of fine-tuning to get
decent results. When we created the first prototype of Neural Foam, the results were jaw
dropping. At first, we assumed we must have missed something because the results looked
so good. The mathematical discoveries enable us to make inferences that
computer science couldn’t before, and business intelligence can now be approached
the same way. It’s no longer about thousands of manually defined business rules or
mapping of processes; it’s about auto-discovery and dynamically adapting to the
environment and real world processes at work. Consider the clumsy, static way traditional
BI systems handle variations in steps of processes during modeling. Compare that with
MetaGrid’s ability to optimally predict and adapt those variations and you’ll understand
the difference.
Closing Thoughts
Many vendors liken big data analytics to oil exploration or gold mining, in which you hope
to find something after a long and protracted data science effort. We feel that current big
data vendors are doing a disservice to their customers by maintaining this myth. We know
we can take raw data, put it into our system, and get great results. We think this
fundamentally changes the landscape in analytics by demystifying it.
We believe we will change the way people approach big data in general because of the
breakthroughs made in our products. You hear various descriptions of big data: velocity,
variety, and volume. At the core, it’s the variety that kills you in large environments because
that’s what creates the computational explosion problems in the query-based analytics
used by other vendors. If you’re trying to build up discovery queries and your organization
has 14,000 different applications with different logging formats, that is a nightmare for
other vendors. You have to build up query sets for all the different applications and the
way that they might interact with each other – an intractable problem. Our product can
find those correlations without needing queries, and it avoids the combinatorial explosion
problem. Nobody else can do that.
From our perspective, if you’re a domain expert who has looked at a data set for years, and
you buy a tool that requires an army of PhDs just to be effective, that’s a bad tool, not a
justification for new positions. If you have a tool that you can effectively use right out of the
gate and at any scale, that changes the game entirely. Hold on tight, it’s here now.
v.16062014
Red Lambda, Inc.
Phone: +1.407.682.1894
Fax: +1.718.247.1852
Corporate Headquarters
2180 West State Road 434
Suite 6200
Longwood, Florida 32779
London
1 Royal Exchange Avenue
London, EC3V 3LT
United Kingdom