You are given several logs that each log contains a unique id and timestamp. Timestamp is a string that has the following format:
Year:Month:Day:Hour:Minute:Second, for example,
2017:01:01:23:59:59. All domains are zero-padded decimal numbers.
Design a log storage system to implement the following functions:
void Put(int id, string timestamp): Given a log's unique id and timestamp, store the log in your storage system.
int Retrieve(String start, String end, String granularity): Return the id of logs whose timestamps are within the range from start to end. Start and end all have the same format as timestamp. However, granularity means the time level for consideration. For example, start = "2017:01:01:23:59:59", end = "2017:01:02:23:59:59", granularity = "Day", it means that we need to find the logs within the range from Jan. 1st 2017 to Jan. 2nd 2017.
put(1, "2017:01:01:23:59:59"); put(2, "2017:01:01:22:59:59"); put(3, "2016:01:01:00:00:00"); retrieve("2016:01:01:01:01:01","2017:01:01:23:00:00","Year"); // return [1,2,3], because you need to return all logs within 2016 and 2017. retrieve("2016:01:01:01:01:01","2017:01:01:23:00:00","Hour"); // return [1,2], because you need to return all logs start from 2016:01:01:01 to 2017:01:01:23, where log 3 is left outside the range.
This solution is based on converting the given timestap into a number. This can help because retrieval of Logs lying\nwithin a current range can be very easily done if the range to be used can be represented in the form of a single number.\nLet\'s look at the individual implementations directly.\n
put: Firstly, we split the given timestamp based on
: and store the individual components obtained into an array.\nNow, in order to put this log\'s entry into the storage, firstly, we convert this timestamp, now available as individual components \nin the array into a single number. To obtain a number which is unique for each timestamp, the number chosen is such that \nit represents the timestamp in terms of seconds. But, doing so for the Year values can lead to very large numbers, which could \nlead to a potential overflow. Since, we know that the Year\'s value can start from 2000 only, we subtract 1999 from the Year\'s value \nbefore doing the conversion of the given timestamp into seconds. We store this timestamp(in the form of a number now), along with the \nLog\'s id, in s , which stores data in the form .
retrieve: In order to retrieve the logs\' ids lying within the timestamp and , with a granularity , firstly, we \nneed to convert the given timestamps into seconds. But, since, we need to take care of granularity, before doing the conversion, we \nneed to consider the effect of granularity. Granularity, in a way, depicts the precision of the results. Thus, for a granularity corresponding to \na Day, we need to consider the portion of the timestamp considering the precision upto Day only. The rest of the fields \ncan be assumed to be all 0\'s. After doing this for both the start and end timestamp, we do the conversion of the timestamp with the \nrequired precision into seconds. Once this has been done, we iterate over all the logs in the to obtain the ids of those \nlogs which lie within the required range. We keep on adding these ids to a list which is returned at the end of this function call.
putmethod takes time to insert a new entry into the given set of logs.
retrieve method takes time to retrieve the logs in the required range. Determining the granularity takes time. But, to find the logs lying in the required range, we need to iterate over the set of logs atleast once. Here, refers to the number of entries in the current set of logs.
This method remains almost the same as the last approach, except that, in order to store the timestamp data, we make use \nof a TreeMap instead of a list, with the key values being the timestamps converted in seconds form and the values being the \nids of the corresponding logs.\n
put method remains the same as the last approach. However, we can save some time in
retrieve approach by making use \nof
tailMap function of TreeMap, which stores the entries in the form of a sorted navigable binary tree. By making use of this function, instead of iterating over all the timestamps in \nthe storage to find the timestamps lying within the required range(after considering the granularity as in the last approach),\nwe can reduce the search space to only those elements whose timestamp is larger than(or equal to) the starting timestamp value.
The TreeMap is maintained internally as a Red-Black(balanced) tree. Thus, the
putmethod takes time to insert a new entry into the given set of logs. Here, refers to the number of entries currently present in the given set of logs.
retrieve method takes time to retrieve the logs in the required range. Determining the granularity takes time. To find the logs in the required range, we only need to iterate over those elements which already lie in the required range. Here, refers to the number of entries in the current set of logs which have a timestamp greater than the current value.
Analysis written by: @vinod23\n