
Tax Management Associates, Inc.

GIS Solutions Challenge

Develop an open source solution to solve common GIS problems.

This challenge is closed

stage:
Closed
prize:
$45,000

Summary

Overview

Geographic Information Systems (GIS) power our world. GIS helps us find the quickest way to a destination and map out property boundaries across a county, and it even allows emergency responders to better prepare for natural disasters. GIS underpins many of the functions we rely on, and yet it still has a long way to go before it is fully optimized, reliable, and efficient for daily tasks or major problem-solving.

The GIS Solutions Challenge is seeking innovators to build a set of tools which the open source GIS community can use to discover specific, scalable, useful, and reliable business insights.

 

Why issue a Challenge?

Extremely large organizations using GIS have developed internal systems to increase the accuracy, efficiency, and reliability of their GIS processes when handling large amounts of data. While many large organizations utilise expensive GIS systems, many resource-constrained organizations and individual innovators turn to open source and affordable platforms. Although small organizations have access to open source GIS tools, these technologies do not allow for the analysis of large datasets. Be it a lack of computational power, speed, or accuracy, current open source tools for smaller organizations are lacking. Bringing the open-source and GIS communities together to solve this issue can not only help us at Tax Management Associates derive new business insights for local governments, but it can also help to improve the very systems of direction, safety, and business we all rely on.

 

The Challenge Breakthrough

The GIS Solutions Challenge asks innovators to develop scalable, efficient, and effective open source tools that generate useful business insights from geospatial data, which can solve three specific GIS problems for large datasets (please see the challenge guidelines for a complete description):

  1. What is the geodesic distance between two features?
    1. E.g., A particular street corner in Detroit is known to be a crime hotspot. How far is this hotspot from the area the police actively patrol? This distance would be measured as a straight line from the point to the edge of a polygon.
  2. What is the network distance between two features?
    1. E.g., What is the actual distance police must travel from the edge of their patrol to reach a crime hotspot? This distance would take into account the specific route the police must travel to reach the hotspot.
  3. Is a point inside or outside a polygon?
    1. E.g., Is the crime hotspot within a police patrol area?

Innovators will be provided three sample data sets to solve the above challenge. In Phase 1, they will be asked to create and share a proof-of-concept. In Phase 2, which can build on that proof-of-concept, innovators will need to develop a fully functional GIS solution that will be tested against a number of technical requirements, such as efficiency, effectiveness, usefulness, innovativeness, and accuracy. Competitors can enter Phase 2 even if they did not enter Phase 1. Beyond a cash prize, the winners will have contributed to creating an open-source GIS solution that can benefit people and organizations globally.

 

What You Can Do Right Now

  • Click Accept Challenge above to register for the challenge and subscribe to updates
  • Read the full details in the competition guidelines
  • Introduce yourself in the forum
  • Share this challenge with your friends, family, and colleagues!

 


Guidelines

The GIS Solutions Challenge asks innovators to develop scalable, efficient, and effective open source tools that generate useful business insights from geospatial data, which can solve the three specific GIS problems listed below.

 

Challenge Goal

Competitors will need to develop a solution to answer one or more of the following questions using an open source analytics platform:

1) What is the geodesic distance between two features?

  • Between a point and the closest edge of a polygon
  • Between a point and another point
  • Between the closest edge of a polygon to the closest edge of another polygon

E.g., A particular street corner in Detroit is known to be a crime hotspot. How far is this hotspot from the area the police actively patrol? This distance would be measured as a straight line from the point to the edge of a polygon.
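As a rough illustration of the point-to-point case, the sketch below uses the haversine formula, which treats the Earth as a sphere. This is an assumption made for brevity: a true geodesic distance is computed on an ellipsoid, and the challenge does not say which method its baseline uses. The function name and the sample coordinates are hypothetical.

```python
import math

# Mean Earth radius in metres. Using a sphere is an approximation; a true
# geodesic is measured on an ellipsoid and can differ by a fraction of a percent.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


# Hypothetical coordinates for a hotspot and a point on a patrol boundary.
hotspot = (42.3314, -83.0458)
patrol_point = (42.3500, -83.0600)
distance_m = haversine_m(*hotspot, *patrol_point)
```

For the point-to-polygon and polygon-to-polygon cases, the same function could be applied to the closest boundary points once those have been found.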

2) What is the network distance between two features?

  • Between a point and the closest edge of a polygon
  • Between a point and another point
  • Between the closest edge of a polygon to the closest edge of another polygon

E.g., What is the actual distance police must travel from the edge of their patrol to reach a crime hotspot? This distance would take into account the specific route the police must travel to reach the hotspot.
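One common way to compute a network distance is a shortest-path search over a graph of street segments. The sketch below uses Dijkstra's algorithm on a small, hypothetical street graph; the node names and edge lengths are made up for illustration, and a real solution would need to build the graph from a street dataset and snap the input features to it.

```python
import heapq
from typing import Dict, List, Tuple

# Adjacency list: node -> list of (neighbour, edge length in metres).
Graph = Dict[str, List[Tuple[str, float]]]


def network_distance(graph: Graph, source: str, target: str) -> float:
    """Shortest travel distance from source to target, or infinity if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter route was already found
        for neighbour, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")


# Hypothetical street graph: intersections A to D, two-way streets.
STREETS: Graph = {
    "A": [("B", 400.0), ("C", 900.0)],
    "B": [("A", 400.0), ("C", 300.0), ("D", 1000.0)],
    "C": [("A", 900.0), ("B", 300.0), ("D", 200.0)],
    "D": [("B", 1000.0), ("C", 200.0)],
}

route_length_m = network_distance(STREETS, "A", "D")  # A -> B -> C -> D
```

Edge lengths must be non-negative for Dijkstra's algorithm to give correct results, which holds for physical street lengths.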

3) Is a point inside or outside a polygon?

  • Is the point completely within the polygon (not including features on the boundary) -- “completely contain within”
  • Is the point within the polygon (including features on the boundary) -- “contain within”
  • Is the point only touching the boundary (so neither in nor out) -- Clementini

E.g., Is the crime hotspot within a police patrol area?
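The three cases above can be told apart by classifying a point as inside, on the boundary, or outside. The sketch below is one possible approach, assuming a simple polygon without holes and planar coordinates: it checks each edge for an exact boundary hit and otherwise counts ray crossings. The function names are hypothetical and the tolerance `eps` is an arbitrary choice.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _on_segment(p: Point, a: Point, b: Point, eps: float = 1e-12) -> bool:
    """True if p lies on the closed segment a-b, within tolerance eps."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > eps:
        return False  # p is not collinear with a and b
    return (min(ax, bx) - eps <= px <= max(ax, bx) + eps
            and min(ay, by) - eps <= py <= max(ay, by) + eps)


def classify(p: Point, ring: List[Point]) -> str:
    """Return 'inside', 'boundary', or 'outside' for a simple polygon ring."""
    px, py = p
    inside = False
    n = len(ring)
    for i in range(n):
        a, b = ring[i], ring[(i + 1) % n]
        if _on_segment(p, a, b):
            return "boundary"
        (ax, ay), (bx, by) = a, b
        # Does a ray cast from p in the +x direction cross this edge?
        if (ay > py) != (by > py):
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if x_cross > px:
                inside = not inside
    return "inside" if inside else "outside"


def completely_within(p: Point, ring: List[Point]) -> bool:
    return classify(p, ring) == "inside"


def within(p: Point, ring: List[Point]) -> bool:
    return classify(p, ring) in ("inside", "boundary")


def touches(p: Point, ring: List[Point]) -> bool:
    return classify(p, ring) == "boundary"


# Hypothetical square patrol area in planar units.
PATROL_AREA = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
```

Treating longitude and latitude as planar coordinates is a simplification that is usually acceptable for small polygons away from the poles and the antimeridian.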

 

Timeline

The timeline for the challenge can be viewed here.

 

Phase 1

Provide a proof-of-concept for your GIS Solution. This needs to include a diagram explaining the solution, a sample workflow of your solution, and descriptive information, including why the business insight generated from your tool will be useful to your target audience. 

  • Prizes: Up to 10 prizes of $1,000 each
  • Feedback will be provided to winning entries
  • Non-elimination: competitors may enter Phase 2 regardless of whether they entered or won Phase 1
  • Duration: 6 weeks

 

Phase 1 Judging Criteria

 

Criteria and score, each followed by its description:

Technical feasibility (50)

  • How feasible is the solution based on modern open source GIS capabilities?
  • Does the solution’s diagram flow in a reasonable manner?
  • Is the solution backed by any GIS science?

Usefulness (40)

  • How useful is the business insight to the target audience?
  • How well does the solution address the described problem?
  • Does the solution generate an output that can be interpreted by someone without data science expertise (i.e., visual representation of output)?

Innovativeness (10)

  • How innovative is the business insight?
  • How original is the development compared to existing tools and solutions?

 

Phase 2

Develop a fully functional GIS Solution. This involves submission of open source code for the solution and finalised documentation, including metadata on each aspect of your solution. Solutions will be evaluated based on speed, scalability, overall architecture, and the business insight they produce. Please see complete judging criteria below.

  • Duration: 6 weeks
  • Prizes: A total prize pool of $35,000 is available for Phase 2. TMA plans to award 3 or more solutions with the minimum prize being no lower than $2,500.

 

Phase 2 Minimum Requirements

All solutions must adequately meet the Phase 2 Minimum Requirements in order to be eligible for a prize. Solutions which meet these Phase 2 Requirements will be ranked against the Judging Criteria below.

 

Solution Languages

Java, R, and Python are the acceptable languages for the challenge. These languages have been selected for their broad open source community in relation to data science, as well as their compatibility with the KNIME Analytics platform. It is possible to use another language so long as that language can be run in a Java, R, or Python environment (for example, Scala can be compiled into a JAR and run with Java).

 

Submission Format

Submissions should be contained in a git repository hosted on https://github.com.  The repository can be public, or marked as private and shared with tma1-dev. At the root of the repository should be a README file that contains instructions for configuring and running the solution (see Documentation below for README requirements).

The solution must be able to run headlessly in a Linux terminal. The headless part of the solution will be used to create evaluations for quantitative judging metrics.

Although not a requirement, it is strongly recommended that your solution integrates with the KNIME platform. Integration with the KNIME platform makes GIS tools more accessible to local governments and other end users. Integration with the KNIME platform can be done as a new KNIME node, as a workflow containing the solution, or simply as a set of instructions that explains how to integrate the source code of the solution into a given set of KNIME nodes.

At a minimum, the solution must accept CSV and shapefiles as input and generate CSV files as output. Additional marks will be given to solutions that accept other types of data input and generate more user friendly output (e.g., map plots like a png or shapefile) as detailed in the judging criteria.
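As a minimal sketch of the headless CSV-in, CSV-out requirement, the code below reads rows from one CSV stream, appends a computed column, and writes the result to another. The column names, the `annotate_csv` and `main` functions, and the example computation are all hypothetical. Shapefile input is not covered here because it needs a third-party reader.

```python
import argparse
import csv
import sys
from typing import Callable, Dict, Optional, Sequence, TextIO


def annotate_csv(src: TextIO, dst: TextIO, column: str,
                 compute: Callable[[Dict[str, str]], object]) -> int:
    """Copy rows from src to dst, adding one computed column. Returns the row count."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames or []) + [column]
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    count = 0
    for row in reader:
        row[column] = compute(row)
        writer.writerow(row)
        count += 1
    return count


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Command-line entry point: no GUI, so it can run in a plain terminal."""
    parser = argparse.ArgumentParser(description="Annotate a CSV of points.")
    parser.add_argument("input_csv")
    parser.add_argument("output_csv")
    args = parser.parse_args(argv)
    with open(args.input_csv, newline="") as src, \
            open(args.output_csv, "w", newline="") as dst:
        # Placeholder computation; a real solution would call its GIS logic here.
        rows = annotate_csv(src, dst, "northern_hemisphere",
                            lambda r: float(r["latitude"]) > 0)
    print(f"processed {rows} rows", file=sys.stderr)
    return 0
```

A script built on this would end by calling `sys.exit(main())`, so that it can be invoked from a terminal with an input path and an output path.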

All components of the solution must be freely available for commercial use or licensed as LICENSE_LIST_GOES_HERE.

 

Documentation

Submissions must be well documented for ease of use and ease of understanding. A submission with poor documentation will not be eligible for a prize.

Innovators must have a README file at the root of their git repository that contains instructions for setting up the solution. The README must contain the following:

  • Clearly explain how to install all necessary dependencies
  • If applicable, document how to configure a KNIME workflow to use the solution and how KNIME integration testing should be performed

Innovators must also provide USAGE documentation either in the README file or separately. Code should be well commented.

 

Testing Environments

All solutions will be tested headlessly in a Linux terminal on Google Cloud Compute Engine using n1-standard-4 instances in the us-east1-c zone. The instances will run Ubuntu 18.04 and KNIME Analytics Platform (Desktop) 3.6. For the terminal-based installation, any relevant dependencies will be installed based on the solution’s README in order to run the solution in a headless fashion. We reserve the right to reject dependencies that require insecure configurations to run (such as adding an unknown apt repository).

 

Performance

Solutions will be evaluated against a baseline and against all other submissions for speed. Each solution will also be evaluated for accuracy and scalability. For the Detroit crime dataset, comparing geodesic distances of patrol area centroids to crimes for 10,000 data points of crime takes 145 seconds on average according to our baseline. Performing a spatial join of 162,449 points from the Africa conflict dataset to a shapefile of the African continent took an average of 6.4 seconds.
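To put the baseline figures in perspective, they can be converted to throughput, and a candidate solution can be timed in the same way. The sketch below is only an illustration: how the baseline timings were actually measured is not stated, and the `time_run` helper and its `repeats` parameter are hypothetical.

```python
import time
from typing import Callable


def records_per_second(records: int, seconds: float) -> float:
    """Throughput implied by processing `records` items in `seconds`."""
    return records / seconds


def time_run(fn: Callable[[], object], repeats: int = 3) -> float:
    """Average wall-clock seconds over `repeats` calls of fn."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


# Baseline figures quoted in the text above.
geodesic_baseline = records_per_second(10_000, 145.0)       # about 69 records/s
spatial_join_baseline = records_per_second(162_449, 6.4)    # about 25,383 points/s
```

Averaging over several runs smooths out one-off effects such as disk caching, although the first run is often slower than the rest.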

 

Additional Requirements

  • Solution runs as expected after following instructions.
  • All solutions must analyze geo-spatial data to provide business insights in a format that is useful to a non-data scientist.
  • All solutions must scale to other data sets of similar size and complexity
  • Solution does not already exist in open source analytics platforms, such as KNIME, R, or similar
  • If developed in an existing open source analytics platform, the solution must also adhere to guidelines for tools or nodes on that specific platform. For example, a node developed in KNIME must adhere to KNIME node submission guidelines.
  • If selected as a winner, the solution must be made open source

 

Phase 2 Judging Criteria

Criteria and score, each followed by its description:

Quantitative evaluation will be performed using the criteria provided below. Your solution will be ranked out of all available solutions and your position in the ranking will determine your score. In order to be eligible for a prize, your solution must meet or exceed the baseline for speed, accuracy, and scalability as detailed in the Performance section above.

Speed (20)

  • Speed will be ranked based on the speed of processing 10,000 records.
  • In the event there are multiple submissions that are competitive at 10,000 records, solutions will be ranked based on the speed of processing 100,000 records.

Accuracy (10)

  • Measured against our baseline methods for acquiring this data, how well does this solution meet that baseline?

Scalability (10)

  • Scalability will be ranked based on the number of records the solution is able to process without crashing. The solution will be tested with 1,000 records, 100,000 records and up to 500,000 records.
  • Can the solution be run with datasets other than just the sample?

Qualitative evaluation will be performed by the judging panel. Solutions will be scored based on their ability to meet or exceed the judging criteria.

Usefulness (30)

  • How useful is the business insight to the target audience?
  • How well does the solution address the described problem?
  • Does the solution generate an output that can be interpreted by someone without data science expertise (i.e., visual representation of output)?
  • Does the solution solve for multiple types of problems (e.g., allow for calculations of point-point distance and point-polygon distance)? See “Challenge Goal” for all options.

Innovativeness (15)

  • How innovative is the business insight?
  • How original is the development?

Ease of Understanding (10)

  • How easy was the solution to understand and evaluate on a technical level, based on documentation provided?
  • Does the solution run as expected headlessly after following instructions?
  • Did the solution make use of an integration with KNIME to ease understanding and evaluation? Was the integration easy to use?
  • If applicable, does the solution run as expected when integrated into KNIME?

KNIME Integration (5)

  • Does the solution work in KNIME Analytics Platform?
  • Is the solution a KNIME workflow?
  • Is the solution a custom KNIME node?

 

Data

We have included two sets of sample data for use when testing your submission. These sets are for sampling and testing purposes, and other datasets are welcome and encouraged. If possible, we strongly recommend providing any additionally tested data sets with your solution.

The Africa Excel file dataset contains 150,000+ incidents of conflict that have a geolocation available as a longitude and latitude. We have also provided an ESRI shapefile of the African continent, which was sourced from http://www.maplibrary.org/library/stacks/Africa/index.htm. These sets of data have been provided to explore point-in-polygon, resource leveling (optimal distance to center of most conflicts), and other solutions.

  1. https://assets.tma1.io/herox/Africa.zip
  2. https://assets.tma1.io/herox/Africa_Crime.xlsx
  3. https://nbviewer.jupyter.org/github/tma1/herox/blob/master/notebook/Africa.ipynb

The Detroit CSV file contains 130,000+ crimes committed in Detroit, each with a geolocation available as a longitude and latitude. These can be juxtaposed with the Detroit patrol area shapefiles that have also been provided. Some of the solutions that can be explored in this dataset are point-in-polygon as well as various distances. These distances can be point to point or point to closest polygon edge. They can be computed using Euclidean, geodesic, or network distance algorithms. (In this case, the network is Detroit streets, a dataset that has not been provided.) All of the Detroit data was gathered from https://data.detroitmi.gov/

  1. https://assets.tma1.io/herox/DPD_Crime_Incidents.csv.zip
  2. https://assets.tma1.io/herox/DPD_Scout_Car_Areas.zip
  3. https://nbviewer.jupyter.org/github/tma1/herox/blob/master/notebook/detroit.ipynb
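For the point-to-closest-polygon-edge case mentioned above, a Euclidean version can be sketched by projecting the point onto each edge and keeping the minimum distance. The sketch assumes planar (projected) coordinates in consistent units, so it is not a geodesic calculation. The function names and the sample square are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from p to the closest point on segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment: a == b
    # Parameter t of the projection of p onto the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_ring_distance(p: Point, ring: List[Point]) -> float:
    """Distance from p to the closest edge of a polygon ring."""
    n = len(ring)
    return min(point_segment_distance(p, ring[i], ring[(i + 1) % n])
               for i in range(n))


# Hypothetical square patrol area in projected units (for example, metres).
SQUARE = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
```

Note that this returns the distance to the boundary even when the point lies inside the polygon, so a caller that wants zero for interior points would need a separate point-in-polygon check.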

 

Rules

Participation Eligibility:

The challenge is open to all adult individuals, private teams, public teams, and collegiate teams. Teams may originate from any country. Submissions must be made in English. All challenge-related communication will be in English.

No specific qualifications or expertise in the field of GIS is required. Prize organizers encourage outside individuals and non-expert teams to compete and propose new solutions.

To be eligible to compete, you must comply with all the terms of the challenge as defined in the Challenge-Specific Agreement.

 

Registration and Submissions:

Submissions must be made online (only), via upload to the HeroX.com website, on or before the deadlines outlined in the Timeline. Please see the submission form for any document upload format requirements. No late submissions will be accepted.

 

Intellectual Property Rights:

If an innovator is awarded a prize, the Sponsor will require all content and assets submitted as part of a Finalist’s Submission to be released under open source licenses that permit free distribution, derivative works, and use in commercial and non-commercial settings. Please see the Challenge-Specific Agreement for complete details.

All Innovators are welcome and encouraged to depend on or make use of other components, libraries, content, assets, and code. All such materials must be available under any Open Source Initiative (OSI) or Creative Commons license compatible with the OSI or Creative Commons license under which the Submission will be released. “Compatible” means that each Innovator’s entire Submission must be usable without violating the license terms of those components licensed under the CC BY 4.0 license, Apache License 2.0, or respective OSI license for the components. Source code licensed under the LGPL, BSD, MIT, or Apache licenses currently meets this criterion; other open source licenses may also meet it. If Innovators make modifications to existing open source projects, they are strongly encouraged to submit patches upstream and work to have them accepted. Patches that are not accepted upstream may be submitted as part of the code developed by the Innovator, under the same Apache License 2.0. Content and assets must be licensed under terms that permit commercial usage. The Creative Commons CC BY and CC-BY-SA licenses currently meet this criterion. Innovators cannot submit entries that include or rely on software or content that is either closed-source, proprietary, illegally sourced, or depends on per-seat licensing.

 

Selection of Winners:

Based on the winning criteria, prizes will be awarded per the Judging Criteria section above. In the case of a tie, the winner(s) will be selected based on the highest votes from the Judges.

 

Additional Information

  • Void wherever restricted or prohibited by law.
  • No purchase or payment of any kind is necessary to enter or win the competition.
  • All ineligible applicants will be automatically removed from the competition with no recourse or reimbursement.
  • All applications will go through a process of due diligence; any application found to be misrepresentative, plagiarized, or sharing an idea that is not their own will be automatically disqualified.
  • By participating in the challenge, each competitor agrees to submit only their original idea. Any indication of "copying" amongst competitors is grounds for disqualification.

 


Challenge Updates

Announcing the Results of the GIS Solutions Challenge

March 1, 2019, 3:24 p.m. PST by Kyla Jeffrey

First of all, thank you so much to everyone who has followed the GIS Solutions Challenge, shared it with your friends and participated!

Unfortunately, only one solution was able to successfully run on the testing dataset and it did not meet the benchmarks set by the competition. As such, no innovator is eligible for an award and we are regretfully closing the competition.

While we were unable to accomplish the goal of the GIS Solutions Challenge, we are grateful to this community for their participation and are excited by the potential for crowdsourcing solutions to common GIS problems.


Last Chance -- 1 Hour Remaining

Dec. 20, 2018, 12:59 p.m. PST by Kyla Jeffrey

This is your FINAL reminder. You have exactly one hour to submit your solution to the GIS Solutions Challenge!

Any last minute questions? Comment directly on this post and we'll get back to you as soon as we can.

Best of luck!


6 Hour Warning!

Dec. 20, 2018, 7:59 a.m. PST by Kyla Jeffrey

If you're still completing your submission, you have exactly 6 hours left to complete it!

Here's a Tip:

HeroX recommends innovators plan to submit with at least a 3-hour window of time before the true deadline. Last-minute technical problems and unforeseen roadblocks have been the cause of many headaches. Don't let that be you!

We look forward to seeing your entries!

 


Nine Day Warning

Dec. 11, 2018, 3:07 p.m. PST by Joeana Lutha Mae Villo

There are only NINE DAYS left to submit your solution for the GIS Solutions Challenge. The early bird definitely gets the worm, so don't put it off! Be sure to have at least 75% of your submission complete a full week before the deadline for maximum flexibility. We'd hate to think you worked hard on a submission, just to miss the deadline by a hair (that's right: no late-night, sad email exceptions. The cut-off is real, folks!)

All that aside, thanks so much to all of you for your interest. Crowdsourcing is nothing without the crowd, and well, that's you. Yeah, you.  

We can't wait to see what the winning solution looks like.

Best of luck to all!


Phase 1 Finalists Announced!

Nov. 8, 2018, 2:56 p.m. PST by Kyla Jeffrey

Thank you to everyone who submitted to Phase 1 of the GIS Solutions Challenge!

We received six entries in Phase 1 and are thrilled to announce that four of the six entries meet the requirements for a Phase 1 prize! Each of these finalists will receive a $1,000 prize to support them in their Phase 2 entry! In addition, $6,000 will be added to the total prize purse for Phase 2.

  • EasyGIS
  • GeoTrue, Open Source Map Routing
  • GeoTrue SQL Queries
  • Clara's ultimate GIS problem solver

Didn't submit to Phase 1 or didn't receive an award for your entry? Not a problem -- you are still able to enter Phase 2!

