I’m having an issue with MobileNetV3 image embeddings:
I compute embeddings for original images in Python and store them in a database.
On Android Kotlin, when a user uploads an image, I also compute its embedding using MobileNetV3 and try to match against the database.
However, I found that for the same image, the pixel values read in Python and Kotlin are completely different, causing mismatched embeddings and incorrect results.
I’ve tried:
- `BitmapFactory.decodeFile` / `decodeStream` with `ARGB_8888`
- Ignoring alpha by drawing onto a black background
- Using `getPixels` to extract RGB
But I still cannot get pixel values that match Python’s np.array(img.convert("RGB")).
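Before assuming the decoders disagree, it helps to compare a few raw pixel values from both platforms at the same coordinates, before any resizing or normalization. The sketch below (the function name and probe coordinates are my own, for illustration) dumps PIL-side values that can be checked one-to-one against `Color.red(bitmap.getPixel(x, y))` etc. on Android:

```python
# Diagnostic sketch: dump raw decoded RGB values at a few coordinates so they
# can be compared against Android's Bitmap.getPixel output for the same file.
from PIL import Image
import numpy as np

def dump_pixels(path, coords=((0, 0), (10, 10), (100, 50))):
    arr = np.array(Image.open(path).convert("RGB"))
    h, w = arr.shape[:2]
    out = {}
    for (x, y) in coords:
        if x < w and y < h:
            r, g, b = arr[y, x]  # NumPy indexes [row=y, col=x]
            out[(x, y)] = (int(r), int(g), int(b))
    return out
```

If the raw pixels already differ, the problem is in decoding; if they match, the mismatch is introduced later, most likely by resizing.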
My Python code example:

```python
def generate_vectors(self, image_paths: List[str]) -> List[List[float]]:
    vectors = []
    for path in image_paths:
        if not os.path.exists(path):
            print(f"Warning: {path} not found, skipped")
            continue
        img = Image.open(path).convert("RGB")
        input_tensor = self.preprocess(img)
        self.interpreter.set_tensor(
            self.input_details[0]["index"], input_tensor)
        self.interpreter.invoke()
        embedding = self.interpreter.get_tensor(
            self.output_details[0]["index"])
        embedding = embedding.squeeze()
        norm = np.linalg.norm(embedding)
        if norm > 0:
            embedding = embedding / norm
        vectors.append(embedding.tolist())
    return vectors

def preprocess(self, img: Image.Image) -> np.ndarray:
    """Process image."""
    img = img.resize((self.width, self.height))
    img_array = np.array(img).astype(np.float32)
    img_array = (img_array / 127.5) - 1.0
    img_array = np.expand_dims(img_array, axis=0)
    return img_array
```

My Kotlin code example:
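One likely source of divergence worth checking (an assumption, not something the snippets prove): `img.resize((self.width, self.height))` relies on Pillow's default resampling filter, which need not match the `ResizeOp.ResizeMethod.BILINEAR` used on the Android side. Making the filter explicit removes one variable; a sketch of the same preprocessing with bilinear pinned down:

```python
# Sketch: same [-1, 1] preprocessing as above, but with the resampling filter
# made explicit instead of relying on Pillow's default. The 224x224 size is
# taken from the Android ResizeOp in the question.
from PIL import Image
import numpy as np

def preprocess(img, width=224, height=224):
    img = img.resize((width, height), Image.BILINEAR)  # explicit filter
    arr = np.array(img, dtype=np.float32)
    arr = arr / 127.5 - 1.0  # same mapping as NormalizeOp(127.5f, 127.5f)
    return np.expand_dims(arr, axis=0)
```

Even with both sides set to bilinear, small numeric differences can remain, because Pillow's downscaling filters are area-weighted (antialiased) while other bilinear implementations sample only the nearest neighbors, so bit-exact equality is not guaranteed.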
```kotlin
val bitmap = this.assets.open("example.png").use { BitmapFactory.decodeStream(it) }
val mobileNetV3 = MobileNetV3(this)
val embedding = mobileNetV3.encodeImage(bitmap)
```

```kotlin
import android.content.Context
import android.graphics.Bitmap
import android.graphics.Color
import org.tensorflow.lite.DataType
import org.tensorflow.lite.support.common.ops.NormalizeOp
import org.tensorflow.lite.support.image.ImageProcessor
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.support.image.ops.ResizeOp
import java.nio.ByteBuffer
import kotlin.math.sqrt

class MobileNetV3(context: Context) : ImageEmbedder {

    private val model =
        MobilenetV3TfliteLarge100224FeatureVectorMetadataV1.newInstance(context)

    private val imageProcessor = ImageProcessor.Builder()
        .add(ResizeOp(224, 224, ResizeOp.ResizeMethod.BILINEAR))
        .add(NormalizeOp(127.5f, 127.5f)) // [-1, 1]
        .build()

    override fun encodeImage(bitmap: Bitmap): FloatArray {
        val tensorImage = TensorImage(DataType.FLOAT32)
        tensorImage.load(bitmap)
        val processedImage = imageProcessor.process(tensorImage)
        val outputs = model.process(processedImage)
        val vector = outputs.featureAsTensorBuffer.floatArray.copyOf()
        l2Normalize(vector)
        return vector
    }

    override fun close() {
        model.close()
    }

    private fun l2Normalize(vector: FloatArray) {
        var sum = 0f
        for (v in vector) {
            sum += v * v
        }
        val norm = sqrt(sum)
        if (norm > 0f) {
            for (i in vector.indices) {
                vector[i] /= norm
            }
        }
    }
}
```

I want Android Kotlin to read the image pixels exactly the same as Python, otherwise the MobileNetV3 embeddings won’t match.
Is there a way to read RGB pixels on Android the same way Python does, or is there a better approach to achieve this matching?
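A complementary approach (my suggestion, not something the question's code already does) is to stop requiring bit-exact pixels at all: since both sides L2-normalize their embeddings, matches can be ranked by cosine similarity with a tolerance threshold, which absorbs small decode/resize differences between platforms. A sketch, where the threshold value is an assumed tuning parameter:

```python
# Sketch: threshold-based nearest-neighbor matching over L2-normalized
# embeddings, so small cross-platform pixel differences don't break matching.
import numpy as np

def cosine_similarity(a, b):
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    denom = float(np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.dot(a, b)) / denom if denom > 0 else 0.0

def best_match(query, database, threshold=0.9):
    # database: iterable of (item_id, vector); returns (None, threshold)
    # when nothing clears the threshold.
    best_id, best_sim = None, threshold
    for item_id, vec in database:
        sim = cosine_similarity(query, vec)
        if sim >= best_sim:
            best_id, best_sim = item_id, sim
    return best_id, best_sim
```

The design trade-off: exact pixel parity is brittle across decoders and resizers, whereas a similarity threshold only assumes the two pipelines are *close*, which is what identical models with near-identical preprocessing actually deliver.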
