Every row of your dataset carries some information. The more columns the dataset has, the more information is stored in each single row.
First logical conclusion - you can't really train your model on just 1 record. Well, actually you can (it's physically possible), but the best your model can do is memorize that single row. No averaging - no training.
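A minimal sketch of this point, using hypothetical synthetic data: a least-squares fit to a single row passes through that one noisy point exactly (pure memorization), while fitting on many rows averages the noise out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: y = 2*x + noise
X = rng.normal(size=(100, 1))
y = 2 * X[:, 0] + rng.normal(scale=0.5, size=100)

# "Fit" on a single row: the slope just reproduces that one
# noisy point exactly -- memorization, not learning.
w_single = y[0] / X[0, 0]

# Fit on all rows: noise averages out, slope approaches 2.
w_full = (X[:, 0] @ y) / (X[:, 0] @ X[:, 0])

print("single-row fit:", float(w_single))  # whatever the noise made it
print("full fit:", float(w_full))          # close to the true slope 2
```

The single-row estimate is at the mercy of one noise draw; the full-data estimate is stabilized by averaging, which is the whole point of batching.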
If your dataset has only 5-10 variables, your model will need at least 15 rows to make even a poor/unreliable prediction, and at least 50-100 rows for a better one.
If you have about 100+ variables, the amount of information per record is greater, and you can use fewer rows to train your model.
My guess - in general the minimum batch size starts from ~13-15 rows. The maximum depends on your dataset and on your computer's memory: the larger your batch is, the more memory the matrix computations will require.
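A quick sketch of the memory point, with illustrative numbers (100 features, float64): the bytes held by one batch grow linearly with batch size.

```python
import numpy as np

n_features = 100  # illustrative feature count

# Each float64 value takes 8 bytes, so one batch occupies
# batch_size * n_features * 8 bytes before any intermediate
# matrices the training step itself allocates.
for batch_size in (16, 128, 1024):
    batch = np.zeros((batch_size, n_features), dtype=np.float64)
    print(batch_size, "rows ->", batch.nbytes, "bytes")
```

So a 1024-row batch needs 64x the memory of a 16-row batch for the input alone, and the activations/gradients scale the same way.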
One more observation - training with a larger batch size goes more slowly (it seems outliers carry less weight in bigger samples than in smaller ones; the model treats them as outliers rather than as normal data).
Also, if your validation set contains only ~110 rows, I think it's better to make the batch at least two times smaller than len(validation_set).
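This rule of thumb can be sketched as a small helper; the function name and signature here are hypothetical, not from any library.

```python
# Hypothetical helper: cap the batch size at half of the
# validation-set length, per the rule of thumb above.
def pick_batch_size(desired: int, val_len: int) -> int:
    return max(1, min(desired, val_len // 2))

print(pick_batch_size(128, 110))  # -> 55: capped by the small validation set
print(pick_batch_size(32, 110))   # -> 32: already small enough
```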
As everywhere, you need to find a balance.