Expedition Technology Named as Finalist for SECAF’s Prestigious Government Contractor of the Year Award

April 12, 2019, McLean, VA – Expedition Technology today announced that it was selected as a finalist for the 11th Annual Small and Emerging Contractors Advisory Forum (SECAF) Awards. Winners will be announced at the SECAF Awards Gala on Thursday, May 2, 2019 at the Hilton McLean in Tysons Corner. The event honors small and emerging government contractors and the players in the industry that rely on small business.

2019 finalists are named in the following categories:

  • Government Contractor of the Year (Under $7.5 Million in Revenue)
  • Government Contractor of the Year ($7.5 to $15 Million in Revenue)
  • Government Contractor of the Year ($15 to $27.5 Million in Revenue)
  • Government Contractor of the Year ($27.5 to $50 Million in Revenue)
  • Award of Excellence
  • Government Project of the Year
  • Mentor-Protégé Program of the Year

Expedition Technology is a finalist for Government Contractor of the Year ($7.5 to $15 Million in Revenue).

“Expedition Technology is a pioneer in machine learning and artificial intelligence for defense and intelligence C4ISR applications. By recognizing the dramatic advantages of AI before many others, Expedition was able to establish a first-mover position in a rapidly growing market, resulting in a nearly five-fold increase in revenue over three years,” says Marc Harlacher, CEO & President. “We deeply appreciate being recognized by SECAF once again and look forward to continuing our relationship as we grow.”

“For more than a decade, the SECAF Awards Gala has been regarded as the consummate awards event honoring the businesses delivering exceptional levels of quality in the government contracting community,” said Curt Anderson, Vice President of Corporate Growth at Strategic Resolution Experts, Inc. and SECAF Board Chair. “It is a privilege and my distinct honor to recognize Expedition Technology as a tremendous example of leading by serving others. The organization demonstrates the discipline required in the face of adversity, the absolute can-do attitude, and the commitment to our customers’ missions which enable our government contractor ecosystem to reach higher levels of growth, maturity, and success.”

# # #

About Expedition Technology, Inc.

At EXP, our mission is to research, design, develop and deploy advanced signal and image processing solutions for our customers’ most demanding Command, Control, Communications, Computers, Intelligence, Surveillance and Reconnaissance (C4ISR) problems. We bring together a talented, multi-disciplinary team of machine learning researchers and engineers to craft innovative analytical capabilities and enable autonomous systems. In an increasingly complex operating environment, our success depends on our ability to obtain accurate insight and to provide awareness wherever and whenever necessary.

About The 11th Annual SECAF Awards Gala

The 11th Annual SECAF Awards Gala is the premier commemorative event honoring the small and emerging government contractors and the players in the ecosystem that rely on small business. The event, expected to sell out at more than 500 attendees, will be held Thursday, May 2, 2019 from 6:00pm to 9:00pm at the Hilton McLean in Tysons Corner. Tickets and tables can be purchased at: www.secaf.org.

About The Small and Emerging Contractors Advisory Forum

The Small and Emerging Contractors Advisory Forum (SECAF) enables the small and emerging government contractor to achieve maximum growth rates in a highly competitive marketplace. Providing members with business resources, access to influencers and government agencies, and advocacy opportunities and education, SECAF is an important resource for a growing company.  SECAF also serves medium to large government contractors, providing invaluable introductions to specialized small businesses that enable the overall contracting community to work successfully in tandem. For more information, visit www.secaf.org.

Expedition Technology Named Best Place to Work By Washington Business Journal

April 5, 2019

The Washington Business Journal has announced the 100 best companies to work for in the DC/NOVA metro area for 2019.

The Best Places to Work list is determined by an employee survey administered by Quantum Workplace, which measures employee engagement scores.

An awards reception to be held on Thursday, May 16 at the MGM National Harbor will reveal the rankings of all 100 companies selected, in four categories: small, medium, large, and extra-large.

This year, more than 500 companies applied for inclusion on the list, and only 100 were selected.

Inclusion on this list is quite an honor and affirms what we already know: that Expedition Technology is the Best Place to Work!

Expedition Technology Holds First Annual π Day Competition

Who says you can’t have fun at work? Not us!

On 3.14 at exactly 1:59:26, the Expedition team gathered ‘round nine delectable homemade pies to observe, judge and ultimately choose a winning entry. Pies were judged not on taste (though they were delicious) but by how creatively they conveyed our good friend π. We were pleasantly surprised by the humorous, subtle, interesting, and just plain tasty solutions our bakers developed.

The results showed once again that engineers (and their spouses) can be very creative with their interpretations of our favorite constant.

Expedition is rapidly growing and hiring, so the competition is sure to be much harder next year. Think you have the creativity to hang with this gang? Drop us your resume. You have a whole year to prepare.

The DARPA AI Colloquium and Expedition

Mr. Mattei’s talk on RFMLS at the AI Colloquium

DARPA’s Artificial Intelligence Colloquium (AIC) is taking place this week at the Hilton Alexandria Mark Center and aims to “highlight recent research results across the full breadth of DARPA’s investment in advancing the state of the art in AI”. As we’ve noted before, Expedition has been at the forefront of this work, designing novel deep learning architectures and applying them to both image and signal solutions for our country.

But you don’t need to take our word for it. This week our own Mr. Enrico Mattei, a research scientist at EXP, was asked to present a summary of the goals of DARPA’s RFMLS program. (UPDATE 26 March 2019: The video of his presentation is now available on YouTube and is embedded above.) RFMLS is aimed at mapping the internet of things with machine learning to improve security through spectrum awareness and emitter identification. Enrico, as the Principal Investigator of our RFMLS effort, is one of only a few non-Government attendees asked to give a talk at the Colloquium. We’re especially proud to have him represent the foundational and novel work the team has been doing on the RFMLS effort.

A Review of Focal Loss at Women in Data Science Blacksburg

Summary: I enjoyed listening to talks, meeting Virginia Tech students, and giving a tutorial on deep learning at the Women in Data Science (WiDS) Blacksburg conference. WiDS events all over the world are happening now to encourage and support current and future women in this field. Some of the material from my tutorial on focal loss, intended for people with a basic background in machine learning, is included below with context for an accompanying Jupyter notebook.

Last week I had the opportunity to attend and present at Women in Data Science Blacksburg, the first WiDS regional event at Virginia Tech, hosted by Dr. Eileen Martin, one of my classmates from grad school. For those of you unfamiliar with WiDS, the first Women in Data Science conference was organized and run at Stanford, led by Dr. Margot Gerritsen, who was the director of my department at the time. One of the things I love about WiDS is how from early on there were efforts to have it reach beyond Silicon Valley by encouraging people around the world to host their own WiDS events. At this point, I’ve attended WiDS conferences at Stanford; Cambridge, MA; Washington, D.C.; and now Blacksburg, VA. Video clips and images from various regional events are compiled and broadcast, so for me, there is a sense of this much broader community extending beyond the people in my locality. Speaking of video clips, the Virginia Tech College of Science has already put together a brief video about WiDS Blacksburg. I hope they continue to support this event in the future.

If you are a woman or male ally interested in data science and machine learning, WiDS Blacksburg was held earlier than most, so there may still be time to register for a WiDS event in your region or participate remotely via the livestream from Stanford.

The tutorial session that I presented at this WiDS focused on focal loss, a variant of the cross-entropy function commonly used by neural networks to perform classification. The paper Focal Loss for Dense Object Detection was published (pre-published?) on arXiv in mid-2017, so it has been around for a bit, but many people are still not familiar with this simple but effective technique. To prepare an interactive example that students could run easily, naturally the first thing I did was search GitHub, because while I could write my own from scratch, let’s be real — I have a day job and a life outside of work, and I strongly believe in minimizing duplication of effort. I found a great example Jupyter notebook and accompanying blog post by user Tony607, forked the repository, and started making changes. I ended up changing a fair amount in order to approach the problem in the way that made the most sense to me and to emphasize certain aspects of how focal loss works. My version of the notebook is available here, although I encourage you to read more of this post before trying it out. (Yes, it’s a toy example with a teeny tiny neural net, and it’s what made the most sense for a live demo.)

In my presentation, I tried to break down the main ideas from the focal loss paper to be more intuitive and digestible for people with less experience in deep learning. Read on for the full explanation, intended for people with a basic background in machine learning, or skip to the last paragraph for a couple sentences’ worth of closing thoughts.

First, let’s take a step back and ask “what problem are we trying to solve?” Say you want to classify each sample from a dataset as one of two classes, and to add a slight complication, the class distribution is imbalanced. (Don’t worry, we can extend focal loss to N classes, but I’m using two for simplicity.) To make this example more concrete, let’s say that the problem is detecting fraudulent financial transactions in a dataset with a large proportion of normal transactions and relatively few fraudulent transactions. In fact, this is the problem used in the Jupyter notebook. You build a neural network and train it for a bit, and it quickly attains the ability to distinguish between normal and fraudulent transactions at a basic level. From this point on, most of the training examples are not doing much to improve your performance because the model is already doing a decent job on them. We will call those “well-classified” or “easy” examples. There is a smaller subset of “hard” examples in the training dataset that are more informative to the model, and focal loss allows us to place more emphasis on those examples.

How does focal loss achieve this objective? Figure 1 from the paper illustrates this well. I’ve taken the original figure from the paper and added my own annotations below. This plot shows curves for the standard cross-entropy loss function and a few variations of the focal loss function, where the variations use different values for the hyperparameter gamma. On the x-axis is the input pt, the predicted probability for the true class, and on the y-axis is the corresponding loss. Consider what happens with a well-classified example — say a training example with a true label of “normal” has a predicted “normal” score of 0.8. Looking at the cross-entropy function, the loss is small and, more importantly, the gradient is small. If we compare that to a hard example, such as a “normal” example with a score of 0.2, where we are not doing well on this example at all, the gradient for the hard example is larger. This is good — the standard cross-entropy loss function already has some built-in ability to place more emphasis on examples where the predictions are further from the truth labels.

However, if we go through the same thought exercise with one of the focal loss curves, we see that the gradient for the well-classified example is even smaller and the gradient for the hard example is even larger. We could interpret this difference in the shape of the loss functions by saying that a model trained with standard cross-entropy loss will continue trying to push scores for the well-classified examples further and further all the way to 1.0, where a model trained with focal loss will not care too much about the well-classified examples and instead work more towards improving on the hard examples. This effect is evident in the Jupyter notebook, and this is a good point to take a look at it and see for yourself.
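To make the comparison concrete, here is a minimal NumPy sketch of the two loss curves (my own shorthand, not the notebook’s Keras implementation; the alpha balancing weight from the paper is left at 1):

    import numpy as np

    def cross_entropy(p_t):
        # Standard cross-entropy given p_t, the predicted probability
        # of the true class.
        return -np.log(p_t)

    def focal_loss(p_t, gamma=2.0, alpha=1.0):
        # Focal loss from the paper: the (1 - p_t)**gamma factor
        # down-weights well-classified examples (p_t near 1) so the
        # hard examples dominate the gradient. alpha is the optional
        # class-balancing weight.
        return -alpha * (1.0 - p_t) ** gamma * np.log(p_t)

    p_t = np.array([0.2, 0.8])  # a hard example and an easy one
    print(cross_entropy(p_t))   # [1.609, 0.223]
    print(focal_loss(p_t))      # [1.030, 0.009] - the easy example nearly vanishes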

There are a couple more key parts of the focal loss paper that I want to discuss. First, the application we were just considering is a pure classification problem. Where does object detection fit in? A common design for deep learning object detection models uses a grid with several template boxes or “anchor boxes” of different aspect ratios at each cell within the grid, and the model learns to classify each cell as either having an object of interest (a ground truth box that roughly matches an anchor box) or having nothing of interest, i.e. belonging to the “background” class. In this example image from SSD: Single Shot MultiBox Detector, only two anchor boxes match the cat, and one matches the dog. The vast majority of anchor boxes do not match a ground truth box, so we have a situation with potentially a large number of easy background examples and a smaller subset of hard examples. For a model like SSD, there is a regression component of the architecture where the model predicts the deltas between the truth bounding box and the anchor box, but focal loss does not directly impact that pathway, and that’s all we’ll say about it here.

Figure 1 from SSD: Single Shot MultiBox Detector

Finally, the focal loss paper also mentions the use of a prior probability for the rare class. Based on my team’s experiments, I can say that adjusting the prior is not necessary (and it is not used in the Jupyter notebook example), but it can help improve performance. The general idea here is that if we know ahead of time that a certain class is very rare (or conversely that a certain class will be overwhelmingly represented) we can initialize the weights for the last layer leading up to classification such that the model is biased toward predicting the rare class with low probability (or predicting the common class with high probability) instead of predicting each class uniformly. The final layer will begin training already able to predict the correct label for most of the examples and just needs to learn to recognize the rare class(es).
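As a hedged sketch of what that prior initialization can look like in Keras (the binary fraud/normal output layer here is my own illustration, not the notebook’s code), setting the final layer’s bias to log(pi / (1 - pi)) makes the untrained model predict the rare class with probability roughly pi:

    import numpy as np
    import tensorflow as tf

    pi = 0.01  # assumed prior probability of the rare (fraud) class
    # sigmoid(b) = pi when b = log(pi / (1 - pi)), so the untrained
    # model starts out predicting "fraud" with probability ~1%.
    bias_init = np.log(pi / (1.0 - pi))

    output_layer = tf.keras.layers.Dense(
        1,
        activation="sigmoid",
        bias_initializer=tf.keras.initializers.Constant(bias_init),
    )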

I think focal loss was a great topic for this setting because it is easy to implement, general enough to apply to many situations, and based on straightforward reasoning about gradients. Maybe I’m an idealist, but I think you or I could come up with an idea like this, too.

References

  1. Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2018). Focal loss for dense object detection. IEEE transactions on pattern analysis and machine intelligence.
  2. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016, October). SSD: Single shot multibox detector. In European conference on computer vision (pp. 21-37). Springer, Cham.

EXP at NAML

The Workshop on Naval Applications of Machine Learning

The 3rd Annual Naval Applications of Machine Learning (NAML) workshop was held February 11 through 14 in San Diego. Hosted by SPAWAR Systems Center Pacific, it featured “oral and poster presentations on technical topics including autonomy, computer vision, and cybersecurity”. The workshop has quickly grown from a great idea into a significant venue for showcasing applied machine learning solutions in our community.

Expedition Technology at NAML

Expedition (EXP) has attended this workshop since its inception and is very happy to be participating this year as well. EXP’s CTO, Greg Harrison, presented our most recent work on object detection and tracking in Wide Area Motion Imagery, while Senior Scientist Enrico Mattei outlined our progress in developing state-of-the-art deep learning systems for analysis in the radio frequency realm on our RFMLS project. These topics were also discussed in an open forum at the GTC-DC conference late last year.

The presented projects aim to take the best approaches in this rapidly evolving field and develop them into deployed, modern solutions in support of the United States. They represent premier examples of the work our team performs here at EXP and why we are proud to be a part of it.

We are hiring!

If you think you would like to spend your days working on solutions to similar problems, apply to one of our positions or reach out to us here or to one of our folks on LinkedIn. We are happy to talk to you about life at EXP.

Oh, The Places We’ve Gone!

Expedition at the GPU Technology Conference

This week the team at Expedition Technology had the opportunity to publicly discuss a couple of the compelling projects we are working on here. At NVIDIA’s GPU Technology Conference in DC (GTC DC), we presented talks on our results in computer vision and in signal processing.

The projects outlined in these talks are great examples of the type of work we tackle here at EXP and are also representative of the state-of-the-art algorithms and results we are developing. If taking on these kinds of big ideas and building solutions to address them is the sort of thing you would love to be doing, drop us a line or check out our current job postings!

Expedition Technology Ranks No. 489 on the 2018 Inc. 5000 With Three-Year Revenue Growth of 1,040 Percent

Inc. Magazine Unveils Its 37th Annual List of America’s Fastest-Growing Private Companies—the Inc. 5000

NEW YORK, August 15, 2018 – Inc. magazine today revealed that Expedition Technology is No. 489 on its 37th annual Inc. 5000, the most prestigious ranking of the nation’s fastest-growing private companies. The list represents a unique look at the most successful companies within the American economy’s most dynamic segment—its independent small businesses. Microsoft, Dell, Domino’s Pizza, Pandora, Timberland, LinkedIn, Yelp, Zillow, and many other well-known names gained their first national exposure as honorees on the Inc. 5000.

“The more than tenfold increase in revenue growth over the last three years is the result of the dedicated drive of our employees and an array of incredibly supportive customers. We have worked hard to align our capabilities with some of the highest priority challenges facing our nation, and we anticipate this positioning will allow us to solve larger and more complex problems in the years to come,” says Marc Harlacher, President and CEO of Expedition Technology.

Not only have the companies on the 2018 Inc. 5000 (which are listed online at Inc.com, with the top 500 companies featured in the September issue of Inc., available on newsstands August 15) been very competitive within their markets, but the list as a whole shows staggering growth compared with prior lists. The 2018 Inc. 5000 achieved an astounding three-year average growth of 538.2 percent, and a median rate of 171.8 percent. The Inc. 5000’s aggregate revenue was $206.1 billion in 2017, accounting for 664,095 jobs over the past three years.

Complete results of the Inc. 5000, including company profiles and an interactive database that can be sorted by industry, region, and other criteria, can be found at www.inc.com/inc5000.

“If your company is on the Inc. 5000, it’s unparalleled recognition of your years of hard work and sacrifice,” says Inc. editor in chief James Ledbetter. “The lines of business may come and go or come and stay. What doesn’t change is the way entrepreneurs create and accelerate the forces that shape our lives.”

The annual Inc. 5000 event honoring the companies on the list will be held October 17 to 19, 2018, at the JW Marriott San Antonio Hill Country Resort, in San Antonio, Texas. As always, speakers include some of the greatest innovators and business leaders of our generation.

Expedition Technology (EXP) offers expertise in algorithm and system development spanning application areas from radar, lidar, imaging and full motion video, to communications, navigation, signal intelligence, and data analytics. With backgrounds as active duty Naval Flight and Air Force officers, engineers, scientists, mission operators and executive managers, the EXP team understands the importance of evaluating challenges from the customer perspective. Our vision is to build EXP into a formidable provider of differentiated image and signal processing products for commercial, defense and intelligence customers.

CONTACT:
Holly Palmer
571-429-6141
info@exptechinc.com

More about Inc. and the Inc. 5000 Methodology
The 2018 Inc. 5000 is ranked according to percentage revenue growth when comparing 2014 and 2017. To qualify, companies must have been founded and generating revenue by March 31, 2014. They had to be U.S.-based, privately held, for profit, and independent—not subsidiaries or divisions of other companies—as of December 31, 2017. (Since then, a number of companies on the list have gone public or been acquired.) The minimum revenue required for 2014 is $100,000; the minimum for 2017 is $2 million. As always, Inc. reserves the right to decline applicants for subjective reasons. Companies on the Inc. 500 are featured in Inc.’s September issue. They represent the top tier of the Inc. 5000, which can be found at http://www.inc.com/inc5000.

About Inc. Media
Founded in 1979 and acquired in 2005 by Mansueto Ventures, Inc. is the only major brand dedicated exclusively to owners and managers of growing private companies, with the aim to deliver real solutions for today’s innovative company builders. Inc. took home the National Magazine Award for General Excellence in both 2012 and 2014. The total monthly audience reach for the brand has been growing significantly, from 2,000,000 in 2010 to more than 18,000,000 today. For more information, visit www.inc.com.
The Inc. 5000 is a list of the fastest-growing private companies in the nation. Started in 1982, this prestigious list has become the hallmark of entrepreneurial success. The Inc. 5000 Conference & Awards Ceremony is an annual event that celebrates the remarkable achievements of these companies. The event also offers informative workshops, celebrated keynote speakers, and evening functions.
For more information on Inc. and the Inc. 5000 Conference, visit http://conference.inc.com/.

For more information contact:
Inc. Media
Drew Kerr
212-849-8250
dkerr@mansueto.com

Fighting GAN Mode Collapse by Randomly Sampling the Latent Space 

At Expedition Technology (EXP) we develop a broad set of deep learning solutions for our customers. Each deep learning development cycle typically starts with

  • Understanding the problem space
  • Getting acquainted with the research landscape
  • Tweaking an existing algorithm or developing entirely new architectures
  • Training on an army of GPUs

This is the standard process, but it comes with a constraint: good results require very large, diverse data sets. As many of our customers’ problems grow more sophisticated, such data sets are becoming an ever rarer commodity. In these cases where data is scarce, there is a necessary additional step – amplifying the data that you have.

For help with this, we have been turning to Generative Adversarial Networks (GANs). Despite their wide-ranging success, deep generative methods are hindered by well-known drawbacks such as training instability and mode collapse. We have recently made progress regarding the latter and would like to share our methods with the rest of the deep learning community. In this post we will introduce GANs, describe mode collapse, and then explain how we’ve attempted to mitigate this problem, adding justifications and results to support our claims.

GANs

Generative Adversarial Networks [1] (GANs) are an incredible technology. Although classification and segmentation are necessary problems, they don’t have the catchy, easy-to-appreciate results GANs do. After all, you can’t become a great artist just by learning to distinguish Van Gogh from Monet. You have to actually pick up a paintbrush and try your hand at it. Similarly, if we strive to make intelligent systems, they must be able not only to discriminate, but also to generate believable outputs. That’s where we cross the border from a passive to an active agent.

[6] – Architecture for a GAN generating MNIST digits

GANs operate by combining two networks – one that creates output, and one that provides feedback. The ‘generator’, as it’s called, is given a random input and tries to return a correspondingly realistic output. The ‘discriminator’ then compares this generated sample to real world ones and gives a score between zero and one of how believable it is. It’s really just a competition: the generator is trying to fool an ever-improving discriminator. If you let them duke it out a few million times, you end up with a discriminator that can tell the real world from the fake, as well as a generator that does a pretty good job of making realistic looking samples.
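For readers who want the competition spelled out, here is a minimal TensorFlow sketch of one adversarial training step. The layer sizes and the MNIST-style 784-dimensional flattened output are assumptions for illustration, not our production architecture:

    import tensorflow as tf

    latent_dim = 32

    generator = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
        tf.keras.layers.Dense(784, activation="sigmoid"),  # a fake 28x28 image, flattened
    ])
    discriminator = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),    # believability score in [0, 1]
    ])

    bce = tf.keras.losses.BinaryCrossentropy()
    g_opt = tf.keras.optimizers.Adam(1e-4)
    d_opt = tf.keras.optimizers.Adam(1e-4)

    def train_step(real_images):
        noise = tf.random.normal([tf.shape(real_images)[0], latent_dim])
        with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
            fakes = generator(noise)
            real_scores = discriminator(real_images)
            fake_scores = discriminator(fakes)
            # The discriminator tries to score real as 1 and fake as 0;
            # the generator tries to make the discriminator score fakes as 1.
            d_loss = (bce(tf.ones_like(real_scores), real_scores)
                      + bce(tf.zeros_like(fake_scores), fake_scores))
            g_loss = bce(tf.ones_like(fake_scores), fake_scores)
        d_grads = d_tape.gradient(d_loss, discriminator.trainable_variables)
        g_grads = g_tape.gradient(g_loss, generator.trainable_variables)
        d_opt.apply_gradients(zip(d_grads, discriminator.trainable_variables))
        g_opt.apply_gradients(zip(g_grads, generator.trainable_variables))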

This is a powerful tool, as it theoretically allows for creating unlimited additional data. If the generated samples are within the set of all possible inputs, then we can turn 100 data points into 1000 by letting the generator hallucinate 900 new but plausible examples.

Mode collapse

There’s a problem, though. Let’s look at the following situation [2] as a GAN tries to make pictures of cars:

  1. After bumbling around for a bit, the generator learns to draw convincing Honda Civics
  2. The discriminator picks up on this and starts labeling most Honda Civics as generated
  3. In response to this, the generator tweaks its algorithm a bit and begins making a similar but separate class – Honda Accords
  4. Now the discriminator has to adjust, so it starts calling Honda Accords fake
  5. While the discriminator is distracted by Accords, the opportunity presents itself to start making convincing Civics again, which the generator happily reverts to
  6. Repeat steps 2-5

This infinite loop of similar outputs is termed mode collapse, and it is one of the things restricting GANs from being widely used as a data amplification tool. The consequence of mode collapse is that we cannot create an unlimited supply of unique samples, since our generator only flips back and forth between a couple of very similar outputs. This minimally satisfies the job of fooling the discriminator but is ultimately unhelpful if we are trying to stretch the effectiveness of our currently available data.

How to avoid mode collapse

To address this, we decided to add a constraint: the generator outputs must be random, but in such a way that any such random output is believable. An intuitive way to enforce this is to find some compressed space X that is densely packed with examples, such that any point within that space corresponds to a true data sample. If we can also find a bijection f: X→Y from X, our densely packed space, to Y, our space of real examples, then we can randomly sample X and convert those points to plausible outputs.

Luckily for us, autoencoders are great at finding exactly such a space and such a function. The basic idea is that an autoencoder takes an input, compresses it to a lower-dimensional vector, then reconstructs the input from that vector. The bottleneck in the middle, then, contains the relevant information about the input with fewer variables, providing us a compressed space, referred to as the latent space. The decoder, given a point in that space, recreates the input that was encoded, which provides us with our bijection f. This relies on two assumptions that we will provide evidence for in the next section.

[5] – Architecture for an autoencoder that compresses MNIST digits
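A minimal Keras sketch of that idea, in the spirit of [5] (the layer sizes and the 8-dimensional latent space are assumptions for illustration):

    import tensorflow as tf

    latent_dim = 8  # the bottleneck: deliberately small, as discussed below

    encoder = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(latent_dim),                 # point in the latent space
    ])
    decoder = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
        tf.keras.layers.Dense(784, activation="sigmoid"),  # reconstruction
    ])

    autoencoder = tf.keras.Sequential([encoder, decoder])
    autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
    # autoencoder.fit(x_train, x_train, ...)  # the target equals the input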

What does this all mean? If we set up an autoencoder to densely encode inputs to a latent space, then any randomly sampled point in that latent space should give a realistic, equally random output upon decoding. Somewhat surprisingly, with a small enough dimensionality of the latent space, this actually works.

Our architecture for the L-GAN

To employ this effectively, we make a small GAN that finds a sub-basis of this latent space, and then take random samples from this sub-basis. In practice, this means that we train a GAN to generate a batch of vectors, enforce that they are orthogonal using their dot product, and then take random linear combinations of these vectors. The discriminator then decides whether these linear combinations are convincing latent space encodings. Those that fool the discriminator get decoded into realistic samples. Due to the sampling being random and the decoder being a bijection, our results are random elements that are indiscernible from the true data. See the figure below for some examples of non-cherrypicked eights generated by the network.

Random 8’s generated by our GAN + Decoder
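As a hedged sketch of the two pieces described above (dimensions are assumptions, and the L-GAN generator and discriminator themselves are omitted):

    import tensorflow as tf

    def orthogonality_penalty(basis):
        # basis: (k, latent_dim) vectors emitted by the generator.
        # Penalizing off-diagonal entries of the Gram matrix pushes
        # the vectors toward an orthogonal sub-basis of the latent space.
        gram = tf.matmul(basis, basis, transpose_b=True)
        off_diag = gram - tf.linalg.diag(tf.linalg.diag_part(gram))
        return tf.reduce_sum(tf.square(off_diag))

    def sample_latents(basis, n):
        # Random linear combinations of the learned sub-basis become the
        # candidate latent codes judged by the discriminator and, if
        # convincing, decoded into samples.
        coeffs = tf.random.normal([n, tf.shape(basis)[0]])
        return tf.matmul(coeffs, basis)  # shape (n, latent_dim)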

The reason for having the GAN find a sub-basis is that it is difficult to find a perfect dimensionality of the latent space. This means that not every one of the axes is guaranteed to be utilized evenly. Therefore, it is more sensible to choose a dimensionality that allows the autoencoder some leniency, and to then let the generator learn the necessary basis of ‘highest plausibility’.

This approach is reminiscent of variational autoencoders (VAEs) [4], which also encode the data samples for the purposes of generation. VAEs, however, sample the latent space differently, electing instead to add random std. normal vectors to the encodings. In a VAE, the normal vectors are based on a mean and standard deviation that are also created by the encoder. In our approach, the encoder simply defines the latent space, which is then sampled by a wholly separate GAN.
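For contrast, the VAE-style sampling just described looks roughly like this (the reparameterization trick, with the mean and log-variance coming from the VAE’s encoder):

    import tensorflow as tf

    def vae_sample(mean, log_var):
        # z = mean + sigma * epsilon, with epsilon drawn from a standard
        # normal; mean and log_var are outputs of the VAE encoder.
        eps = tf.random.normal(tf.shape(mean))
        return mean + tf.exp(0.5 * log_var) * eps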

Reasoning for why this works

There are two critical assumptions that substantiate our approach:

  1. The latent space is densely packed
  2. The decoder approaches a bijection

We provide two points of evidence to show that the latent space is densely packed. The first is a thought experiment. Given inputs that have 10 independent variables, and an encoded vector of length 5, we should expect that an autoencoder learns to utilize every degree of freedom to its fullest extent. If, instead, it only uses three axes of the five provided to it, the autoencoder will be further from representing the ten independent variables of the input space, implying that an easy lower minimum is available on the error landscape. This presents the caveat that our encodings need to be smaller in dimensionality than the number of independent variables in the input space. Such a requirement ensures that the optimal encoder takes advantage of every axis provided to it. Simply put, if you don’t give the encoder adequate dimensionality to represent the information, it must learn to take advantage of everything it has.

The second point is empirical, as seen by traveling through a latent space. It turns out, if we encode two handwritten MNIST digits to a latent space, the points between their encodings also represent plausible outputs, as seen in the figure [3] below. This implies that, given two known points in latent space, any point randomly between them is likely to also represent believable outputs. Our approach treats the latent representations differently by making a unique space for each digit, rather than a single latent space for all of them. In either case, the result should still hold.

[3] – Movement in the latent space from the encoding of a five to the encoding of a nine
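A quick sketch of that experiment, using the encoder and decoder halves from the earlier autoencoder snippet (shapes assumed to match):

    import numpy as np

    def interpolate(x_a, x_b, encoder, decoder, steps=10):
        # Encode two digits, walk the straight line between their latent
        # codes, and decode each intermediate point.
        z_a = encoder(x_a[None, :]).numpy()[0]
        z_b = encoder(x_b[None, :]).numpy()[0]
        alphas = np.linspace(0.0, 1.0, steps)
        zs = np.stack([(1 - a) * z_a + a * z_b for a in alphas])
        return decoder(zs)  # each row should decode to a plausible digit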

As for the second assumption, it is not strictly true that the decoder is a bijection, in part due to the discrete nature of the dataset. However, we can make a case that the decoder of a functional autoencoder will approach a bijection, as long as the encodings map to a densely packed space. We do this by showing that the encoder approaches a bijection from true inputs to unique points in the latent space. The decoder then, as the inverse of the encoder, must learn the inverse bijection.

Before explaining the reasoning for the decoder being a bijection, we want to touch on why this is necessary. A bijection is a function f: X→Y that is both ‘onto’ and ‘one-to-one’. This means that any possible value O ∈ Y has exactly one corresponding input I for which f(I) = O. If both the encoder and the decoder are bijections, then any point randomly sampled in the latent space must have a unique, correspondingly random point in the true data space.

We can claim that the encoder is ‘onto’ as a consequence of our reasoning for the latent space being densely filled. In order to fill that space, the encoder must map the inputs to different locations within the latent space. As such, if the whole constrained-dimensionality latent space is filled, then the encoder is onto. We can also show that a working autoencoder’s encoder is ‘one-to-one’ by contradiction. If it were not one-to-one, then two different inputs could map to the same latent representation. Because we assume the autoencoder is functional, this point in the latent space would have to be decoded back out to the two different inputs, which is impossible by the definition of a function. As such, an optimal encoder approaches a bijection, and therefore the decoder must do the same.

These assumptions come together for the logic of our generative approach. Autoencoders can find a latent space in which every point maps to plausible outputs, and simultaneously approximate the bijection between this latent space and the output space. Therefore, randomly sampling the dense latent space corresponds to randomly sampling the set of realistic data samples. The quality of decoded samples is then a direct result of how ‘bijective’ the encoding and decoding operations are.

Results

The ultimate goal is to amplify our existing data by generating new samples that are indiscernible from the original set. To this end, we set up an experiment where we trained a basic MNIST classifier on the full train set, on a tenth of the train set, and on a tenth of the train set along with generated samples. The GAN in this case was also trained on the same tenth.

We trained the GAN on each digit independently and created 5000 new samples for each. When training the classifier with GAN input, we composed each batch of either 25, 50, or 75 percent generated digits; the rest of each batch was taken from the tenth of the train set.
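A hedged sketch of that batch mixing (the array names and batch size are assumptions):

    import numpy as np

    def mixed_batch(real_x, real_y, gen_x, gen_y, batch_size=128, gen_frac=0.5):
        # Draw gen_frac of the batch from GAN-generated digits and the
        # remainder from the real tenth of the train set.
        n_gen = int(batch_size * gen_frac)
        gi = np.random.randint(0, len(gen_x), size=n_gen)
        ri = np.random.randint(0, len(real_x), size=batch_size - n_gen)
        x = np.concatenate([gen_x[gi], real_x[ri]])
        y = np.concatenate([gen_y[gi], real_y[ri]])
        return x, y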

We found that the network trained on a tenth of the dataset plus generated samples is more accurate on the test set than the network trained without generated samples. Specifically, we see a decrease in the error rate of up to 17% after training on our amplified dataset.

Train set                                  Test set accuracy
All train data                             96.85%
Tenth of train data                        94%
Tenth of train data and generated 75/25    94.3%
Tenth of train data and generated 50/50    95%
Tenth of train data and generated 25/75    92.6%

References:

  1. Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. “Generative adversarial nets.” In Advances in neural information processing systems, pp. 2672-2680. 2014
  2. Nibali, http://aiden.nibali.org/blog/2017-01-18-mode-collapse-gans/
  3. Despois, https://medium.com/@juliendespois/latent-space-visualization-deep-learning-bits-2-bd09a46920df
  4. Kingma, Welling. “Auto-Encoding Variational Bayes.” https://arxiv.org/pdf/1312.6114.pdf
  5. Chollet, “Building Autoencoders in Keras”, https://blog.keras.io/building-autoencoders-in-keras.html, 2016
  6. Chablani, “GAN – Introduction and Implementation”, https://towardsdatascience.com/gan-introduction-and-implementation-part1-implement-a-simple-gan-in-tf-for-mnist-handwritten-de00a759ae5c, 2017