# SVHN Data Preparation

In the previous post, I talked about how to use the h5py package to read the MAT-file that contains the bounding box information of the SVHN dataset. After successfully reading the bounding box data, we can start to train a neural network for the SVHN recognition task. The bounding box data provided in the dataset gives the position, size and label of each digit in the image. This means that for a 4-digit house number there are 4 boxes in total, each bounding just 1 digit, as shown below:

What I would like to achieve is to train a model that recognizes the complete house number in one go, rather than recognizing the individual digits and then combining the results. The reason is that even if the model correctly recognizes 3 out of 4 digits of a 4-digit house number, the combined result is still wrong and useless. Therefore, I need to create an enclosing bounding box for each image that contains all the individual boxes of that image, so that the big box bounds all the annotated digits.

The main process is to read the information of all the bounding boxes, and then calculate the position and size of the circumscribed bounding box:

**(x, y)** is the minimum of all the **x** and **y** values, where each (x, y) denotes the top-left corner of a bounding box. **width** is the maximum of the sums of **x** and **width** over all boxes, minus the merged **x**. **height** is the maximum of the sums of **y** and **height** over all boxes, minus the merged **y**.
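As a quick worked example of the calculation described above, the sketch below merges three made-up boxes. The `boxes` values, the tuple layout and the `enclosing_box` helper are illustrative assumptions; they are not taken from the dataset or from the original source code.

```python
def enclosing_box(boxes):
    """Return the smallest (left, top, width, height) box covering all boxes.

    Each box is a (left, top, width, height) tuple; this layout is an
    assumption made for the example only.
    """
    left = min(b[0] for b in boxes)
    top = min(b[1] for b in boxes)
    # Furthest right and bottom edges reached by any individual box
    right = max(b[0] + b[2] for b in boxes)
    bottom = max(b[1] + b[3] for b in boxes)
    # Size is measured from the merged top-left corner
    return left, top, right - left, bottom - top


# Three made-up digit boxes, roughly side by side
boxes = [(10, 5, 8, 20), (19, 4, 9, 22), (29, 6, 8, 19)]
merged = enclosing_box(boxes)
print(merged)  # (10, 4, 27, 22)
```

Here the right-most edge is 29 + 8 = 37 and the lowest edge is 4 + 22 = 26, so the merged box is 37 - 10 = 27 wide and 26 - 4 = 22 high.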

These steps translate into the following function. Note that it also changes the label of the digit zero from ‘10’ to ‘0’.

```
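from operator import add


def merge_bbox(f, idx=0):
    # get_img_boxes is the reader helper (defined elsewhere, not in this post)
    meta = get_img_boxes(f, idx)
    # Top-left corner of the merged box: smallest left and top of all digits
    left = min(meta['left'])
    top = min(meta['top'])
    # Furthest right/bottom edge minus the merged corner gives the size
    width = max(map(add, meta['left'], meta['width'])) - left
    height = max(map(add, meta['top'], meta['height'])) - top
    # The digit zero is labelled 10 in the dataset; map it back to 0
    labels = [x if x != 10 else 0 for x in meta['label']]
    bbox = {'left': left, 'top': top, 'width': width, 'height': height, 'labels': labels}
    return bbox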
```
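One way to sanity-check the output of `merge_bbox` is to confirm that the merged box really contains every individual box. The sketch below is only an illustration: `box_contains` is a hypothetical helper, and the `meta` and `bbox` values are made up by hand in the same dictionary layout that `merge_bbox` reads and returns.

```python
def box_contains(outer, meta):
    """Check that `outer` covers every individual box listed in `meta`."""
    right = outer['left'] + outer['width']
    bottom = outer['top'] + outer['height']
    for l, t, w, h in zip(meta['left'], meta['top'],
                          meta['width'], meta['height']):
        # Any edge sticking out of the outer box fails the check
        if l < outer['left'] or t < outer['top'] or l + w > right or t + h > bottom:
            return False
    return True


# Made-up per-digit data (label 10 stands for the digit zero)
meta = {'left': [10, 19, 29], 'top': [5, 4, 6],
        'width': [8, 9, 8], 'height': [20, 22, 19], 'label': [1, 10, 5]}
# The merged box expected for this sample, worked out by hand
bbox = {'left': 10, 'top': 4, 'width': 27, 'height': 22, 'labels': [1, 0, 5]}

print(box_contains(bbox, meta))  # True
```

A check like this catches the easy mistake of forgetting to subtract `left` and `top` when computing the merged width and height.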

Here are some examples showing the individual bounding boxes and the circumscribed bounding box.

The source code for preparing the bounding box information can be found here.