You can keep things embedded in the image file if you really want, even if you change the data format to include, for example, other layers, but that just makes it really hard to do anything without specialized tools.
Oh okay, then I vote for a format that is just a zip file containing a bunch of PNGs (background.png, wallmask.png, or maybe layer###.png), a config file that specifies the parallax for each layer and whatever other parameters are necessary, and either a .ENT file like GB uses now or some better way of storing entities. (I was thinking of a colour-coded PNG, but then there's the problem of entity overlap, so no.) Layers could possibly even be animated, for example for snow or rain, or everything could be animated, like an animated wallmask so you could have moving platforms.

This way you could do almost everything with an image editor, a text editor and a zip tool, save maybe for placing entities, and you could go from a basic BG/WM/entities map if you're lazy to a very fancy multi-layer, parallaxed and animated world.

You could probably even throw in some of your own sounds: a global loop for ambient sound (like rain and thunder, synchronised with the animated layer that represents the thunder, for instance), local sounds defined by their position and spatial range/loudness/whatever in either the map's config file or their own config file, and maybe even event-based sounds (like for walking on snow). Maybe you could even throw some scripts in there (Lua, maybe?). That'd be a pretty nice format.

One small technical detail, though, is the animations: GIF kinda sucks, but what else is there in 24 bits?
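To make the idea a bit more concrete, here's a rough sketch of what reading such a zip package could look like. Everything specific in it is an assumption for illustration only: the config being JSON and named config.json, the "layers"/"parallax" keys, and the entities.json file (a plain list of type/x/y records, which sidesteps the overlap problem) are all made up here, not an existing format. The PNG contents are placeholder bytes, since actual image decoding would be left to whatever image library the game uses.

```python
import io
import json
import zipfile


def build_demo_package() -> bytes:
    """Build a tiny example map package in memory (hypothetical layout)."""
    config = {
        "layers": [
            {"file": "background.png", "parallax": 1.0},
            {"file": "layer001.png", "parallax": 0.5},
        ]
    }
    entities = [
        {"type": "spawn", "x": 120, "y": 340},
        {"type": "medkit", "x": 120, "y": 340},  # overlapping entities are fine
    ]
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("config.json", json.dumps(config))
        zf.writestr("entities.json", json.dumps(entities))
        # Placeholder bytes, NOT real PNG data; a real package holds real images.
        for name in ("background.png", "layer001.png", "wallmask.png"):
            zf.writestr(name, b"placeholder for " + name.encode("ascii"))
    return buf.getvalue()


def load_map_package(data: bytes) -> dict:
    """Read the package into layers, wallmask bytes, and an entity list."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = set(zf.namelist())
        config = json.loads(zf.read("config.json").decode("utf-8"))

        layers = []
        for layer in config.get("layers", []):
            if layer["file"] not in names:
                raise ValueError(f"missing layer image: {layer['file']}")
            layers.append({
                "file": layer["file"],
                # Assumed default: a layer with no parallax value moves 1:1.
                "parallax": float(layer.get("parallax", 1.0)),
                "png_bytes": zf.read(layer["file"]),
            })

        # Wallmask and entities are optional in this sketch.
        wallmask = zf.read("wallmask.png") if "wallmask.png" in names else None
        entities = []
        if "entities.json" in names:
            entities = json.loads(zf.read("entities.json").decode("utf-8"))

    return {"layers": layers, "wallmask": wallmask, "entities": entities}


if __name__ == "__main__":
    pkg = load_map_package(build_demo_package())
    for layer in pkg["layers"]:
        print(layer["file"], "parallax", layer["parallax"])
    print(len(pkg["entities"]), "entities")
```

The nice part of doing it this way is that a lazy map only needs a background, a wallmask and an entity list, while sounds or scripts could later be added as extra files plus extra config keys without breaking older packages.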