CodeFights Solves It: hostnamesOrdering

CodeFights Solves It: hostnamesOrdering

The Challenge of the Week that ended on March 21 was hostnamesOrdering. Only 72 CodeFighters submitted solutions – this was a tricky one! In this breakdown, I’m going to walk you through my thought process in writing a solution.

In this database challenge, the task is to take a table of numeric ids and hostnames, and return a sorted version of the table. The challenge is that we have to order the hostnames by the reverse hostnameThe hostname is a string of domains separated by periods, such as codefights.com or www.github.com. The domains don’t include any protocols (such as http://), nor do they include any paths (such as codefights.com/forums). The general form of a hostname is domain1.domain2. … .domainN. The reversed hostname is domainN. … .domain2.domain1. For example:

Hostname...has the reverse hostname
www.codefights.comcom.codefights.www
dev.mysql.comcom.mysql.dev
web.mit.eduedu.mit.web
codepen.ioio.codepen

Note that the list above is ordered by the reverse hostname. We are told that there are at most 3 domains in any given hostname that appears in this challenge.

Getting started

Let’s start with an example of the table hostnames:

idhostname
1workbench.mysql.com
2codefights.slack.com
3codefights.com
4snarknews.info
5sololearn.com
6dev.mysql.com
7codepen.io
8com

Let’s suppose for a moment that we have a way of generating the reversed hostnames, which will actually be the hardest part of the challenge.

idhostnamereversed
1workbench.mysql.comcom.mysql.workbench
2codefights.slack.comcom.slack.codefights
3codefights.comcom.codefights
4snarknews.infoinfo.snarknews
5sololearn.comcom.sololearn
6dev.mysql.comcom.mysql.dev
7codepen.ioio.codepen
8comcom

The query we want to return is:

which would return:

idhostname
8com
3codefights.com
6dev.mysql.com
1workbench.mysql.com
2codefights.slack.com
5solonews.com
4snarknews.info
7codepen.io

Reversing the domain name

We won’t actually construct an explicit reversed column. Instead, we will generate the pieces of the domain we want on the fly. To do this, we will be using the SUBSTRING_INDEX function. From the documentation, we see that SUBSTRING_INDEX( string, delim, count) returns a substring.

If count is positive, SUBSTRING_INDEX( string, delim, count) returns the substring from the beginning to the countth occurrence of delim. For example:

If count is negative, SUBSTRING_INDEX finds the |count|th occurrence of delim from the right, and returns the substring from there to the end of the string. For example:

Using the same example we’ve been working with, here is a query that splits the hostname into three pieces. Using SUBSTRING_INDEX, we can extract the top level domain, the midlevel domain, and the lowest level of domain just by changing the index value.

This would work if every domain has exactly three pieces. Unfortunately, some domains only have one or two pieces. What is actually returned is:

idhostnametopmid
3codefights.comcomcodefights.com
8comcomcom
6dev.mysql.comcommysql.com
1workbench.mysql.comcommysql.com
2codefights.slack.comcomslack.com
5solonews.comcomsolonews.com
4snarknews.infoinfosnarknews.info
7codepen.ioiocodepen.io

The first two rows are in the wrong order. They match at the top level (i.e., they are both .com domains). The com domain has no lower level, but the SUBSTRING_INDEX just returns the entire string com again. Since codefights.com is earlier in the alphabet than com, the table puts codefights.com first. What should happen is that codefights should be compared to the empty string, and com should appear first.

One trick to fix this problem is to prepend the hostnames with ‘…’, guaranteeing that there are three periods in each domain. So our query is now:

Writing down just the first four rows, this query returns:

idhostnametopmid
8comcom.com
3codefights.comcomcodefights.com
6dev.mysql.comcommysql.com
1workbench.mysql.comcommysql.com
…….……..…….……..

This has everything in the right order!

A problem and a workaround

In principle, we just have to eliminate the top and middle column, so we are only reporting the id and hostname. Before making this change, we can try running this in CodeFights. When we do it, we run into a problem: the CodeFights MySQL engine returns workbench.mysql.com before dev.mysql.com!

This seems to be a bug; running MySQL on a local machine returns dev.mysql.com before workbench.mysql.com. However, ordering by the hostname on CodeFights returns the opposite. When using ORDER BY top, mid, hostname, workbench.mysql.com appears first. This seems wrong, since both these hostnames have identical top and mids, so the ordering should be determined by hostname.

Part of being a programmer is working around issues in any software! The problem we are encountering comes from doing multiple orderings. Our workaround will be to concatenate the top, middle, and hostname into one string. We will separate the top, middle, and hostname by a space, because spaces cannot occur in a domain and spaces appear earlier in ASCII than any letter.

So our workaround is:

This orders dev.mysql.com and workbench.mysql.com correctly.

Eliminating columns

Now we need to eliminate the top and mid columns. We could do a subquery, but instead we will actually construct the top and the middle strings as part of the ORDER BY clause.

Our solution ends up looking like this:

This solution only selects the desired columns. Notice that in the ORDER BY we have a single string, made from the “top” level domain, a space, the “middle level” domain, a space, and then the hostname. This gets around the implementation bug, and gets us a solution with 188 characters.

On a local MySQL database, we don’t need the final concatenation, and we could use multiple ordering to get:

Going further

As a code golfing exercise, we have some compact solutions. As a coding exercise, you might be interested in trying to solve the general case where we don’t have a limit on the maximum number of domains in a hostname!

This takes us back to our original solution:

where the interesting exercise is to create a custom function reverse_hostname. If you get stuck, this StackOverflow post might be useful.

Tell us…

How did you tackle this challenge? What do you think about the approach I took in solving it? Let us hear from you in the CodeFights forum!


Also published on Medium.

Comments are closed.