Last year when I was interviewing with Amazon, the first question that I was asked is to write a function `get_number_of_islands(matrix)`

. This was the first time I had seen this problem. It was hard for me. In retrospect, I want to share what I had learned and approaches to solving this problem.

## Problem

Given a 2 dimension array `matrix`

of `0s`

and `1s`

, count the number of islands of `1s`

. An island is surrounded by a group of adjacent cells that are all `1s`

. A cell can only be adjacent to each other horizontally and vertically.

### Example

```
input: binaryMatrix = [ [0, 1, 0, 1, 0],
[0, 0, 1, 1, 1],
[1, 0, 0, 1, 0],
[0, 1, 1, 0, 0],
[1, 0, 1, 0, 1] ]
output: 6 # there are six islands
```

## Approaches

First, you should ask the interviewer questions for clarification, paraphrase the problem to make sure you understand it correctly, and/or state any assumptions.

- Does the matrix have the same number of columns or rows? (Interviewer: it should not matter)
- We assume that the cell can be adjacent only horizontally and vertically,
**not diagonally**? (Correct) - The function should handle the empty matrix and out of bound data access correctly.

Second, now you can talk about the processes to solve this problem out **loud**.

Okay, so we want to *iterate over each cell* in the matrix. Then we can check if the cell is `1`

or `0`

.

It is a `0`

, it is not an island and we can continue. If it is a `1`

, it is guaranteed that we found **1** island. (sound like a base case? ) At this point, we need a mechanism to make this cell as **visited**, so next time around we can skip it to avoid repeated counting. After that, we want to look at all the **adjacent** cells to repeat the process (recursion).

**Pseudocode**

```
def main():
visited [] # array to keep track of visited cell
num_islands = 0 # start with zero
for each row, col in matrix
num_island += get_number_of_islands(matrix, row, col) #return 1 if it is an island
return num_island
def get_number_of_islands(matrix, row, col, visited):
check if row and col is out of bound of the matrix
check if we already visited the cell with row, col
check if the cell is 0
=> return 0
mark the cell as visited in the visited array
recursive call get_number_of_islands() on each adjacent cell
return 1
```

## Solution (Python)

```
def get_number_of_islands(binaryMatrix):
rows = len(binaryMatrix)
cols = len(binaryMatrix[0])
# you can use Set if you like
# or change the content of binaryMatrix as it is visited
visited = [[0 for col in range(cols)] for r in range(rows)]
number_of_island = 0
for row in range(rows):
for col in range(cols):
number_of_island += get_island(binaryMatrix, row, col, visited)
return number_of_island
# get a continuous island
def get_island(binaryMatrix, row, col, visited):
if not is_valid(binaryMatrix, row, col)
or visited[row][col] == 1 or binaryMatrix[row][col] == 0:
return 0
# mark as visited
visited[row][col] = 1
get_island(binaryMatrix, row, col + 1, visited)
get_island(binaryMatrix, row, col - 1, visited)
get_island(binaryMatrix, row + 1, col, visited)
get_island(binaryMatrix, row - 1, col, visited)
return 1
def is_valid(binaryMatrix, row, col):
rows = len(binaryMatrix)
cols = len(binaryMatrix[0])
return row >= 0 and row < rows and col >= 0 and col < cols
```

## Extra

- What is the time and space complexities?
- Question variation: Find the island with the largest size? In this case above, return
`5`

- This problem can also be solved without using recursion as well. But it is required a Queue or Stack to keep track of adjacent cells to visit.

## Notes

This question can be hard and make an interviewee being stuck (including myself) due to a number of reasons.

- unfamiliarity with basic
**Graph**data structure and the concepts of**DFS**or**BFS**traversal. This is considered a graph with the adjacency matrix. - being stuck in handling the base case and recursive case.
- interview time constraint

*Cover image is obtained from unsplash.com

## Discussion

My approach to deal with this kind of problems :

For this specific task my solution is :

Use a double-buffered row (matching the matrix row length), in which I'll put "colours" (a natural will do fine) for each "1" found. Use an invalid colour (-1 for naturals) for the "0"s in the row. You fill one row buffer form left to right, then you swap buffers. To know what colour to assign, you have to check :

1) Is the matching matrix cell land or water ? If it's water, use invalid_colour

2) Else use colour of the cell directly to the left, or the colour of the cell directly above (that's why you need to keep 2 colour rows, but you don't need more). If both are invalid, use a new colour.

Do all the rows until you reach the bottom of the matrix. Initialize the colour buffers with invalid_colour.

By doing that, you will have "coloured" all the landmasses. But it may happen that a single island has 2 colours (Try with a U-shaped island). To avoid counting double you'll use a map that store, for each colour, the "name" of the island. (of course, the "names" too can be naturals). The idea is : when you add a new colour (see step 2 above), also add it to the map with a new island name. And when you detect a 2 different adjacent colours, change the name mapped to one of the colours, so they both map to the same name (by doing that, you're "freeing" an island name, that you can store for reuse later). Detecting adjacent colours is quite easy. In step 2 of the algorithm, if the cell above, and the cell on the left both have a valid colour, and they're different, then you should "merge" names.

At the end, you just want to count the number of different names, which is easy if you used integers, and reused "free"d ones : that's your "highest" name, minus the number of free'd ones remaining at the end.

This would be quite efficient if the matrix is given as a continuous stream of rows of booleans. If the matrix is given as columns, you may want to adapt the algorithm.

This is pretty detail. To be honest, I have to read it a couple times for it to make sense. I just want to do good enough for now to pass an interview. :-p

Do you mean it's surrounded by adjacent 0's?

a correct way to say it would be "an island is a group of adjacent cells that are all 1s, surrounded by 0s or array's margins"

This picture should clarify what it means. Each color group represents each island.

Thanks, that's what I thought it meant, but wasn't sure.

I was asked this question in an interview(not Amazon). Initially I thought that I won't be able to solve it, but was able to with some hints from the interviewer. My solution was in C# and was similar to yours.

awesome. It always felt great to be able to find the solution in the end. Did you end up getting that job?

Umm, call me dumb, but according to the definition you have provided, there is only one island - centered at the second row fourth column.

The solution rather says an island is an isolated one, or a group of adjacent ones, diagonals excluded; that would give six as a result indeed.

Ha! I actually asked the exact same challenge by Amazon few months back. Sadly unable to solve it during the interview itself. But 1.5 hours afterwards, I solved it and then sent my solution to them. Didn't land the job though :)

I think you could also solve it using Flood Fill

Algorithm, which is similar to your approach. Thank you for sharing!

I have no idea what Flood Fill algorithm is. Will take a look at it. Thanks

Tbh, this is a very basic question, solvable in linear time with graph traversal algorithms like BFS/DFS.