Encouraging Signals – A Project Discovery Update
A message from François Bouchy, Maxime Marmier and Oliver Turner – members of Michel Mayor’s team in Switzerland.
In July 2017, all 176,802 light curves obtained by the CoRoT space mission were injected into EVE Online with the aim of detecting new transiting exoplanets. At the astronomy department of the University of Geneva, a first analysis of the 44.4 million classifications provided by 77,709 players in 190 days is underway. This is a staggering number of classifications; for comparison, the Zooniverse Galaxy Zoo Project received 50 million classifications during its first year. Hundreds of light curves, previously thought to have no transits, have been identified by the pilots of EVE Online as new transit candidates with a very high consensus. It is important to note that a high consensus does not mean the events are new planets, so these new events are presently being analyzed in more detail by the astronomers of the University of Geneva and cross-checked with other astronomical databases.
Much like the galaxy of New Eden, the work of finding planet transits is fraught with difficulty, even for professionals. As such, a high fraction of these new candidates correspond to artefacts in the light curves caused by instrumental effects, intrinsic stellar variability or eclipsing binary stars. In spite of this, dozens of the detections checked so far present the typical shape of an exoplanet signature. This goes to show that, though only a fraction of the data supplied has been cross-checked by astronomers so far, EVE Online players are making a remarkable impact on the analysis of the CoRoT light curves. Their efforts have proved particularly valuable in finding transit-like signals buried in stellar variability, as well as individual, non-repeating transit events. The latter are of great interest as they may turn out to be planets in very long period orbits, which are currently under-represented and poorly studied.
The following is a small selection of particularly interesting objects:
A (potential) Jupiter-like planet with a variable host star:
A (potential) long-period Jupiter-like planet:
A (potential) Neptune-like planet (from a somewhat noisy light curve):
Regardless of their diverse in-game roles (be they haulers, combatants, spies or the like), the pilots of EVE Online have shown themselves to be effective, real-life explorers of the galaxy.
Although we are well aware of the EVE Community’s tendency to surpass expectations, we may yet again have underestimated the capsuleers’ capacity for science. Rest assured, however, there’s plenty for Project Discovery to do, as we try to discover something new in all the stars in the sky that the CoRoT telescope has seen twinkling.
A message from the Project Discovery devs
Project Discovery is a continuous process. Along the way, we’ve found some issues, some solutions and some improvements to guide us to even greater heights.
Working on Project Discovery has been a unique challenge for several reasons, not least of which has been the sheer amount of data we are working with. The magnitude of the data makes checking it by hand more or less impossible, so when problems arise they need to be classified in such a way that we can detect problematic samples automatically.
With the help of capsuleers in EVE Online, however, this data becomes much more manageable and we’ve already found and eliminated several issues.
We’ve even reached the point of sending the first batch of data off to Geneva for follow up research.
Initial Patterns and Problems
At launch, we employed a "training period" system from the previous Project Discovery, where players would be presented with increasingly difficult tasks until they had solved at least one at each difficulty level.
This worked without any issues for the Human Protein Atlas. However, after launching Exoplanets on Tranquility we quickly encountered an issue where players were getting stuck in this training period because they couldn't solve the higher difficulty tasks. Having completed all the lower difficulties, they would only be served the hardest tasks we had to offer, leaving them in a loop with no way to move on to analyzing unsolved samples.
This resulted in an extremely frustrating experience, with most players having to try and guess their way out of the training period with extremely low accuracy scores.
After investigating the situation, we determined that this was primarily caused by the fact that the most difficult tasks in the Exoplanets project were much more difficult than the most challenging Human Protein Atlas tasks.
We ended up solving this by removing the highest task difficulty (level 5) from this training period and resetting the accuracy of all players that had dropped below 50%. We decided to only reset those below 50% since anyone above that accuracy rating had proven themselves adept at the task.
The players who were getting stuck in the training set also raised another issue - some of the level 5 difficulty tasks were simply too hard for many of our players to solve at all. Watching the feedback from players and examining the samples they were having trouble with led us to the discovery that some of them were only solvable by looking at other data alongside the curve or by using frequency analysis, since the transits were obscured by the general noise of the sample.
After some discussions with Michel Mayor's team, we concluded that these could safely be excluded, as they couldn't realistically be found by eye and didn't further the scientific goals of the project.
The Many Paths of Discovery
As some of you know and have pointed out, many powerful methods for frequency analysis already exist, such as Fourier transforms, least-squares spectral analysis and box-fitting least squares algorithms.
So why don’t we use them?
Project Discovery is aimed at compensating for weaknesses inherent in those algorithms, namely their reliance on perfectly periodic data and their ineffectiveness at dealing with noise.
Expecting perfectly periodic data sounds like a fair assumption when dealing with the orbits of planets. However, systems that include multiple planets in close proximity don't exhibit perfectly periodic orbits, since each planet's gravitational pull affects the orbits of the other planets in the same system. The resulting effect on transit times is known as Transit Timing Variation (TTV).
TTV can create small variations in the period of the transiting planet, which severely inhibit the frequency analysis-based algorithms used to automatically search through this data. However, the wobbles are generally small enough that they can still be selected with our marker tool without going outside the allowed margin of error. A visual example of this is shown in this video.
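To make the effect of TTV on folding concrete, here is a minimal Python sketch. It is an illustration only, not the code used by Project Discovery or by any of the algorithms mentioned above. The period, TTV amplitude, TTV cycle length and margin are all made-up values, and the sinusoidal wobble is just one simple way to model the variation.

```python
import math

def fold(times, period):
    """Map each timestamp to an orbital phase in [0, 1)."""
    return [(t / period) % 1.0 for t in times]

def max_phase_offset(phases):
    """Largest distance of any phase from phase 0, measured around the circle."""
    return max(min(p, 1.0 - p) for p in phases)

# All values below are hypothetical and chosen only for illustration.
PERIOD = 3.0          # orbital period in days
TTV_AMPLITUDE = 0.05  # size of the timing wobble in days
MARGIN = 0.1          # assumed marker tolerance in days
N_TRANSITS = 20

# Transit mid-times for a perfectly periodic planet.
regular = [k * PERIOD for k in range(N_TRANSITS)]
# The same planet with a slow sinusoidal wobble in its transit times.
wobbly = [t + TTV_AMPLITUDE * math.sin(2 * math.pi * k / 7)
          for k, t in enumerate(regular)]

regular_offset = max_phase_offset(fold(regular, PERIOD))
wobbly_offset = max_phase_offset(fold(wobbly, PERIOD))
# Evenly spaced markers still land within the margin of every wobbly transit.
markers_ok = all(abs(w - r) <= MARGIN for w, r in zip(wobbly, regular))

print("phase smear without TTV:", regular_offset)
print("phase smear with TTV:", wobbly_offset)
print("all markers within margin:", markers_ok)
```

In this toy setup the perfectly periodic transits stack exactly when folded, while the wobbly ones smear out in phase, which is the kind of blurring that hurts a fold-based search. The evenly spaced markers nevertheless stay inside the assumed margin for every transit.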
Aside from TTV, there are also two other types of transits where Project Discovery is expected to outperform algorithmic analysis.
Firstly, we have transits in systems with strong stellar activity, where the algorithms can’t get a clean fold, even if the transit can be seen clearly with the naked eye.
Secondly, there are circumbinary planets, which are planets orbiting binary stars. Frequency analysis-based algorithms have trouble with these because the relative position of the stars and the planet can be different each time the planet transits.
Tweaking Tools and the Dataset
Since the release of the new Project Discovery, we have made several changes. Firstly, we have removed all samples where the signal-to-noise ratio is too low, since they aren’t fair and don’t significantly contribute to the scientific effort.
We noted a large number of players having trouble marking high-frequency transits accurately. A small error in the period adds up cumulatively with each marker, causing later markers to eventually fall outside the allowed margin of error. We attributed this to the difficulty of reading whether the transits are perfectly aligned in the folded mode.
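The cumulative drift can be worked through with a short Python sketch. The numbers here (periods, margin and observation length) are hypothetical and are not the values used in the game. The only assumption is that the k-th marker is placed k guessed periods after the first transit, so its offset from the true transit is k times the period error.

```python
def markers_outside_margin(true_period, guessed_period, margin, duration):
    """Count markers that miss their transit by more than the margin.

    The k-th marker is off by k * |guessed_period - true_period|.
    """
    n_transits = int(duration // true_period)
    error_per_orbit = abs(guessed_period - true_period)
    return sum(1 for k in range(1, n_transits + 1)
               if k * error_per_orbit > margin)

# Hypothetical values: the same 0.02-day period error in both cases.
MARGIN = 0.15    # assumed tolerance in days
DURATION = 90.0  # assumed length of the light curve in days

# Short period: 60 transits in the window, so the error has room to grow.
short_missed = markers_outside_margin(1.5, 1.52, MARGIN, DURATION)
# Long period: only 3 transits in the window, so the drift stays small.
long_missed = markers_outside_margin(30.0, 30.02, MARGIN, DURATION)

print("short-period markers outside margin:", short_missed)
print("long-period markers outside margin:", long_missed)
```

With these made-up numbers, the same period error leaves every marker of the long-period planet inside the margin, but pushes most markers of the short-period planet outside it from the eighth orbit onward.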
To combat this issue, we added a magnifying glass on the minimap. This allows players who often encounter this issue to check whether each individual marker falls within a transit by looking at a zoomed in version of the full unfolded graph at any time during the folding process.
We have also, at the suggestion of our players, added a feature that allows players to examine multiples of the selected period. This is especially useful when additional transits, aside from the one being folded, become visible during the folding phase.
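As a rough illustration of why checking multiples of a period helps, the Python sketch below folds a set of transit times at several multiples of a selected period and counts how many separate phase groups appear. This is a simplified model with hypothetical values and a hypothetical helper name, not the in-game implementation. A single group suggests a clean fold, while several groups suggest the selected period is a multiple of the real one.

```python
def distinct_phases(transit_times, period, tolerance=0.01):
    """Count groups of folded phases; one group means a clean fold."""
    groups = []
    for t in transit_times:
        phase = (t / period) % 1.0
        for g in groups:
            d = abs(phase - g)
            # Compare around the circle so phases near 0 and 1 count as close.
            if min(d, 1.0 - d) <= tolerance:
                break
        else:
            groups.append(phase)
    return len(groups)

# Hypothetical example: the real period is 2.5 days, but 5.0 was selected.
TRUE_PERIOD = 2.5
SELECTED_PERIOD = 5.0
transits = [k * TRUE_PERIOD for k in range(12)]

# Fold at half, one and two times the selected period.
results = {m: distinct_phases(transits, SELECTED_PERIOD * m)
           for m in (0.5, 1, 2)}
print(results)
```

In this example, folding at the selected period shows two groups of transits, folding at twice the period shows four, and folding at half the period collapses everything into one group, pointing to 2.5 days as the better choice.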
When building this project, we became aware that the solutions for eclipsing binaries were flawed due to the inclusion of only the primary eclipse and not the secondary one. As such, we removed them from the dataset until their solutions could be adjusted. However, after releasing the feature we became aware of a separate type of binary that had the same problem, causing players to encounter samples where half the transits were correctly part of the solution and half of them were not. These samples have been removed from the data set.
Final Remark:
Once again, the contributions of EVE Online pilots to this groundbreaking project cannot be overstated. Thank you to MMOS and the University of Geneva for allowing our pilots the opportunity to bring their expertise in navigating the galaxy into the world of real science.
MMOS recently published a blog about the results so far, which you can check out here.
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement Nr 732703.