Tennis API

I published a tennis API on

This API has methods to query all tennis matches on the ATP since 1968,  along with players and qualifications. Dynamic filtering, sorting and pagination of the results are allowed.

Example request to get matches played on Clay that lasted over 350 minutes (almost 6 hours!):

curl --get --include '{"minutes":">350", "surface": "Clay"}' \
  -H 'X-Mashape-Key: {mashape_key}' \
  -H 'Accept: application/json
  "_items": [
      "id": 125145,
      "tourney_id": "2004-520",
      "tourney_name": "Roland Garros",
      "surface": "Clay",
      "draw_size": 128,
      "tourney_level": "G",
      "tourney_date": "2004-05-24",
      "match_num": 61,
      "winner_id": 102148,
      "winner_seed": 0,
      "winner_entry": "",
      "winner_name": "Fabrice Santoro",
      "winner_hand": "R",
      "winner_ht": 178,
      "winner_ioc": "FRA",
      "winner_age": 31.4552,
      "winner_rank": 58,
      "winner_rank_points": 662,
      "loser_id": 103096,
      "loser_seed": 32,
      "loser_entry": "",
      "loser_name": "Arnaud Clement",
      "loser_hand": "R",
      "loser_ht": 173,
      "loser_ioc": "FRA",
      "loser_age": 26.4339,
      "loser_rank": 33,
      "loser_rank_points": 1005,
      "score": "6-4 6-3 6-7(5) 3-6 16-14",
      "best_of": 5,
      "round": "R128",
      "minutes": 393,
      "winner_ace": "22",
      "w_df": 2,
      "w_svpt": 229,
      "w_1stIn": 108,
      "w_1stWon": 77,
      "w_2ndWon": 60,
      "w_SvGms": 36,
      "w_bpSaved": 13,
      "w_bpFaced": 22,
      "l_ace": 10,
      "l_df": 9,
      "l_svpt": 234,
      "l_1stIn": 121,
      "l_1stWon": 84,
      "l_2ndWon": 52,
      "l_SvGms": 34,
      "l_bpSaved": 17,
      "l_bpFaced": 27,
      "_created": "Thu, 01 Jan 1970 00:00:00 GMT",
      "_updated": "Thu, 01 Jan 1970 00:00:00 GMT",
      "_etag": "d6e3dcd73f0a27191ce602f0d537e697a022e690",
      "_links": {
        "self": {
          "title": "Atp_match",
          "href": "atp_match/125145"
      "id": 164617,
      "tourney_id": "2017-0308",
      "tourney_name": "Munich",
      "surface": "Clay",
      "draw_size": 32,
      "tourney_level": "A",
      "tourney_date": "2017-05-01",
      "match_num": 297,
      "winner_id": 111202,
      "winner_seed": 0,
      "winner_entry": "",
      "winner_name": "Hyeon Chung",
      "winner_hand": "R",
      "winner_ht": 0,
      "winner_ioc": "KOR",
      "winner_age": 20.95,
      "winner_rank": 78,
      "winner_rank_points": 659,
      "loser_id": 105373,
      "loser_seed": 0,
      "loser_entry": "",
      "loser_name": "Martin Klizan",
      "loser_hand": "L",
      "loser_ht": 190,
      "loser_ioc": "SVK",
      "loser_age": 27.8056,
      "loser_rank": 53,
      "loser_rank_points": 880,
      "score": "6-4 3-6 6-2",
      "best_of": 3,
      "round": "QF",
      "minutes": 987,
      "winner_ace": "4",
      "w_df": 2,
      "w_svpt": 81,
      "w_1stIn": 47,
      "w_1stWon": 36,
      "w_2ndWon": 15,
      "w_SvGms": 13,
      "w_bpSaved": 4,
      "w_bpFaced": 6,
      "l_ace": 8,
      "l_df": 4,
      "l_svpt": 95,
      "l_1stIn": 49,
      "l_1stWon": 28,
      "l_2ndWon": 27,
      "l_SvGms": 14,
      "l_bpSaved": 8,
      "l_bpFaced": 12,
      "_created": "Thu, 01 Jan 1970 00:00:00 GMT",
      "_updated": "Thu, 01 Jan 1970 00:00:00 GMT",
      "_etag": "3e4d5985b601dad5b0229b881b6a0fe52aa06af0",
      "_links": {
        "self": {
          "title": "Atp_match",
          "href": "atp_match/164617"


Posted by admin in API, data-science, 0 comments
Tablifyme API

Tablifyme API

I created an API that creates a text table from a JSON array. Useful to send tables of data over email, Slack or Markdown.

The API is available on Mashape:


curl -X POST --include '' \
-H 'X-Mashape-Key: My-Secret-key' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
--data-binary '{"body":[[1,2,3],[4,"test",6]],"header":["Y","A","B"]}'


|  Y  |  A   |  B  |
|  1  |  2   |  3  |
|  4  | test |  6  |



Posted by admin in API, data-science, 0 comments
Choropleth: percentage of foreigners by US county

Choropleth: percentage of foreigners by US county

Building maps is very easy with Folium. I got my hands on some data from the US Census, specifically, the foreign born population and total population per US county.

To plot it, first download the CSV from the  2011-2015 American Community Survey 5-Year Estimates.

import os
import folium
import pandas as pd

# Load data
data = pd.read_csv('data.csv')

# Add column with ratio of foreign born vs total pop.
data['ratio_foreign_born_vs_total_population'] = 100 * data['Estimate; Foreign born:'] / data['Estimate; Total:']

# Create map
map_1 = folium.Map(location=[39, -96], zoom_start=4)
high_res_county_geo = os.path.relpath('gz_2010_us_050_00_500k.json') # from

# Add choropleth layer
 columns=['Id', 'ratio_foreign_born_vs_total_population'],
 legend_name='Percentage Foreigners(%)',
 threshold_scale=[0, 5, 10, 15, 20, 25]


# Save as index.html'index.html')

The result:

Posted by admin in data-science, python, 0 comments

Redshift UDF Phone Number to Country

Redshift’s UDFs (User Defined Functions) permit to execute, with some limitations, certain Python libraries and custom code.

In my case, I wanted to find a way to extract the country code from a phone number in E.164 format. UDFs are a perfect fit for this, the implementation in SQL would most certainly require creating custom views, and hacking your way around, while in python, we can use the library phone-iso3166

>>> from import *
>>> phone_country('+1 202-111-1111')
>>> phone_country('+34645678901')

To upload a library to Redshift, we first need to check it follows the structure:


In our case, phone-iso3166 is already in that structure. Now we need to zip the library:

tar -xvzf phone-iso3166.tar.gz
zip -r phone-iso3166

With our zipped library, we need to upload it to S3. I did this part manually into a bucket named s3://redshift/custom-udfs/

Now, connect to Redshift and issue:

CREATE LIBRARY phone_iso3166 LANGUAGE plpythonu 
FROM 's3://redshift/custom-udfs/' 
CREDENTIALS 'aws_access_key_id=<your-aws-key>;aws_secret_access_key=<your-aws-pass>' 
region as '<your-region>';

Last step:

CREATE OR REPLACE FUNCTION udf_phone_country (phone_number VARCHAR(64)) RETURNS VARCHAR(64) IMMUTABLE as $$ 
from import phone_country, InvalidPhone
    return phone_country(phone_number)
    return None
$$ LANGUAGE plpythonu;

You should be all set to use this in your queries:


That returns:



Posted by admin in data-science, 0 comments

Redshift Force Drop User

Ever tried dropping a user in Redshift only to discover that user “user_1” cannot be dropped because the user has a privilege on some object.

That’s not of great help Redshift. Luckily for us, there is a solution. Kamlesh Gallani posted a response about revoking permissions for tables and schemas that I user might still be assigned to, along with changing the ownership of tables that the user might have created. After that, dropping the user is straightforward.

I created a python snippet with his response that might save you a few minutes. You will need the view v_get_obj_priv_by_user from amazon-utils, simply create the view on your Redshift cluster, then, copy and paste this Python script into your favorite editor (e.g. Sublime Text) fill in the connection information and the list of users to drop and execute it.

import psycopg2
# Connect to Redshift
conn = psycopg2.connect(host="", dbname="", user="", password="", port="5439")
conn.autocommit = True
cursor = conn.cursor()
# List of users to drop
users = ['user_to_drop_1', 'user_to_drop_2']
# New owner, used when changing ownership
new_owner = 'new_user'
# Templates to change ownership, revoke permissions, and drop users
change_ownership = "select 'alter table '+schemaname+'.'+tablename+' owner to %s;' from pg_tables where tableowner like '%s'"
revoke_schema_permissions = "select distinct 'revoke all on schema '+schemaname+' from %s;' from admin.v_get_obj_priv_by_user where usename like '%s'"
revoke_table_permissions = "select distinct 'revoke all on all tables in schema '+schemaname+' from %s;' from admin.v_get_obj_priv_by_user where usename like '%s'"
drop_user = "drop user %s;"
for user in users:
    # Change ownership
    cursor.execute(change_ownership % (new_owner, user))
    for r in cursor.fetchall():
        print("Executing: %s" % r[0])
    # Revoke schema permissions
    cursor.execute(revoke_schema_permissions % (user, user))
    for r in cursor.fetchall():
        print("Executing: %s" % r[0])
    # Revoke table permissions
    cursor.execute(revoke_table_permissions % (user, user))
    for r in cursor.fetchall():
        print("Executing: %s" % r[0])
    # Drop user
    cursor.execute(drop_user % (user))
Posted by admin in computer, data-science, 0 comments

SET Game Visual Solver

For those of you that have never played SET® before, SET® is a fast-paced game involving 81 unique cards with different combinations of four characteristics (color [red, green, or purple], shape [diamond, oval, or squiggle], shading [hollow, shaded, or filled], and number [1, 2, or 3]). The point of the game is to find as many 3-card “sets” as possible. If the three cards are either all the same or all different in each of the four characteristics, you know you’ve found a set. Twelve cards are presented on the table at a time, and are continually replenished when a set is taken out, until no cards remain in the deck. Sometimes, however, there are no sets in the twelve cards, in which case 3 more cards are dealt. But, how do you know if there really is no set? I aim to solve this problem using image processing software and eventually create a set recognition mobile application.

So, what determines a set? Let’s take the example below (although it is probably one of the most difficult sets in the game). Looking at the cards in terms of the four characteristics, we have the following: color – red, green, and purple (all different); shape – diamond, oval, and squiggle (all different); shading – filled, hollow, shaded (all different); number – 1, 2, 3 (all different). If the cards were exactly the same, but all red, this would also be considered a set, because, in terms of color, they are all the same.


There’s the simplified breakdown of how I used an image processing algorithm to determine sets from ordinary jpeg images:

Step 1: Convert the image to black and white and invert the image to calculate the number of shapes per card

Step 2: Compare the pixel surface area of one of the symbols on each card to determine the shape

Step 3: Normalize the RGB color of the images with an equation to figure out the color

Step 4: Convert the images to gray scale and apply a border detection to determine the shading

Step 5: Run the simple algorithm that deduces the sets

For more details, check out the paper I wrote.

Posted by admin, 0 comments

BlackJack Hitting vs. Standing chances

Who has never thought about beating the house in a Casino? As the MIT students said in their documentary Breaking Vegas, it’s not all about the money; but about the feeling of beating a huge corporation in their own game.

It’s very easy to find online the basic strategy for BlackJack, i.e., the correct decision when the only information that you have about the game are your cards, and the dealers facing card. Understanding by correct decision, the one that would minimize your losing chances. Even with this basic strategy the odds against the house are negative, but greatly reduced to a mere -3 $ per 100$ played (aprox). Of course the Breaking Vegas students didn’t just played basic strategy, they also counted cards and played in a team not to raise any attention from the Big Brother.

For this post, I wrote a small Python notebook that computes what are the chances for the different combinations of cards and strategies in their most simplified way, standing and hitting.

The full notebook:

And the results in two images:


On the Y-Axis, the starting score of the player, and on the X-Axis, the dealer’s facing up card (11 being the Ace). The color and values represent the chances of winning if the following the strategy.

Posted by admin, 0 comments