You use a 3D Perlin noise function and sample it with the block coordinates as input. That gives you one float value per coordinate (between 0 and 1, if the noise is normalized to that range), and based on that value you decide whether to place a block or not. For example, place a block if the value is below 0.5. To generate more reasonable terrain, you then want to mix several noise functions with different scales and offsets. Then you probably want to scale the output based on the y-coordinate so that, on average, the values get higher the higher up you are and lower the further down you are. That results in a surface with air above and denser terrain below.
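To make that a bit more concrete, here is a rough sketch in Python. Note that this is not real Perlin noise: to keep it short and dependency-free I swapped in a simple hashed value noise that also returns values between 0 and 1. The names (`noise3`, `density`, `is_solid`), the scales, the weights and the height of 64 are all made up for illustration.

```python
import math

def _hash01(ix, iy, iz, seed=0):
    """Deterministic pseudo-random value in [0, 1) for an integer lattice point."""
    h = (ix * 374761393 + iy * 668265263 + iz * 2147483647 + seed * 144665) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    h ^= h >> 16
    return h / 2**32

def _smooth(t):
    """Smoothstep fade so the noise has no hard edges at lattice points."""
    return t * t * (3.0 - 2.0 * t)

def _lerp(a, b, t):
    return a + (b - a) * t

def noise3(x, y, z, seed=0):
    """Value noise (a stand-in for Perlin noise) with output in [0, 1]."""
    x0, y0, z0 = math.floor(x), math.floor(y), math.floor(z)
    tx, ty, tz = _smooth(x - x0), _smooth(y - y0), _smooth(z - z0)

    def corner(dx, dy, dz):
        return _hash01(x0 + dx, y0 + dy, z0 + dz, seed)

    # Trilinear interpolation between the 8 surrounding lattice values.
    x00 = _lerp(corner(0, 0, 0), corner(1, 0, 0), tx)
    x10 = _lerp(corner(0, 1, 0), corner(1, 1, 0), tx)
    x01 = _lerp(corner(0, 0, 1), corner(1, 0, 1), tx)
    x11 = _lerp(corner(0, 1, 1), corner(1, 1, 1), tx)
    return _lerp(_lerp(x00, x10, ty), _lerp(x01, x11, ty), tz)

def density(x, y, z, height=64, seed=0):
    """Mix two noise layers, then bias the result by the y-coordinate."""
    # Two layers with different scales and offsets (values chosen arbitrarily).
    n = (0.7 * noise3(x * 0.02, y * 0.02, z * 0.02, seed)
         + 0.3 * noise3(x * 0.1 + 100.0, y * 0.1 + 100.0, z * 0.1 + 100.0, seed + 1))
    # Height bias: 0 at the bottom of the world, 1 at the top.
    bias = y / (height - 1)
    return 0.5 * n + 0.5 * bias

def is_solid(x, y, z, height=64, seed=0):
    """Place a block where the biased value is below 0.5."""
    return density(x, y, z, height, seed) < 0.5
```

With this particular weighting the bottom layer always ends up solid and the top layer always ends up air, and everything in between depends on the noise. You would tune the weights and scales until the terrain looks the way you want.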
This is the general approach, but it gets a lot more complicated when you consider performance. Filling a 16x16x256 chunk, like in Minecraft, roughly half with single block objects creates 8x16x256 = 32768 individual block objects, each with its own draw call. This gets very slow very fast. What I'm getting at is this: block worlds are actually not made up of blocks. What you want to do instead, in order to reduce draw calls, is create as few separate mesh objects as possible. In practice this means that you build a mesh out of squares, placing a square only where a side of a block is exposed to air, and then combine all those squares into a single mesh. This makes a single chunk a single object: large enough that you are not bottlenecked by thousands of draw calls, but small enough that you can rebuild the mesh in a reasonable amount of time when the player changes it.
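A minimal sketch of that "only exposed sides" idea, in Python. The chunk is just a set of filled cells here, and everything outside the set (including neighbouring chunks) is treated as air, which is a simplification. The names `build_chunk_mesh` and `quad_corners` are hypothetical, and I am not fixing the winding order per face, which a real renderer would need.

```python
# The six axis-aligned neighbour directions of a block.
FACES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def quad_corners(cell, d):
    """Four corner positions of the unit square on side d of the block at cell.

    Winding order is NOT adjusted per face in this sketch.
    """
    axis = next(i for i in range(3) if d[i] != 0)
    u, v = [i for i in range(3) if i != axis]
    base = list(cell)
    if d[axis] > 0:
        base[axis] += 1  # positive faces sit on the far side of the block
    corners = []
    for du, dv in ((0, 0), (1, 0), (1, 1), (0, 1)):
        c = list(base)
        c[u] += du
        c[v] += dv
        corners.append(tuple(c))
    return corners

def build_chunk_mesh(solid):
    """Build one combined mesh for a chunk.

    solid: set of (x, y, z) cells that contain a block. Cells not in the set
    count as air. Returns (vertices, indices) with two triangles per square.
    """
    vertices, indices = [], []
    for cell in sorted(solid):
        for d in FACES:
            neighbour = (cell[0] + d[0], cell[1] + d[1], cell[2] + d[2])
            if neighbour in solid:
                continue  # this side is hidden by another block, skip it
            start = len(vertices)
            vertices.extend(quad_corners(cell, d))
            indices.extend([start, start + 1, start + 2,
                            start, start + 2, start + 3])
    return vertices, indices
```

A single block gives you 6 squares, two blocks next to each other give you 10 instead of 12, and a completely filled chunk only produces squares on its outer surface. All of it ends up in one vertex list and one index list, so it can be drawn as one object.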
However, this approach causes another problem. A mesh that is drawn in a single draw call uses a single material, so in the simple case you can't have more than one texture per mesh. There are two solutions to this problem. You either create one mesh per "block type" in the chunk, which again increases draw calls by a lot when many materials are used (especially considering player-built structures like houses), or you use a texture atlas: you combine all the textures you are going to use into one large texture, and then use only the part of it that matches the block type of the square you are texturing.
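The atlas lookup itself is just a bit of arithmetic. Here is a small sketch in Python, assuming the atlas is a regular grid of equally sized tiles that are numbered row by row. The function name `atlas_uv`, the `BLOCK_TILES` table and the 4x4 layout are made up for the example, and which corner counts as v = 0 depends on your engine.

```python
# Hypothetical mapping from block type to tile number in the atlas.
BLOCK_TILES = {"grass": 0, "dirt": 1, "stone": 2, "sand": 5}

def atlas_uv(tile_index, tiles_per_row, tiles_per_col=None):
    """Return (u_min, v_min, u_max, v_max) in 0..1 for one tile of a grid atlas."""
    if tiles_per_col is None:
        tiles_per_col = tiles_per_row  # assume a square atlas by default
    col = tile_index % tiles_per_row
    row = tile_index // tiles_per_row
    tile_w = 1.0 / tiles_per_row
    tile_h = 1.0 / tiles_per_col
    return (col * tile_w, row * tile_h, (col + 1) * tile_w, (row + 1) * tile_h)

def square_uvs(block_type, tiles_per_row=4):
    """UV coordinates for the four corners of one square of the given block type."""
    u0, v0, u1, v1 = atlas_uv(BLOCK_TILES[block_type], tiles_per_row)
    return [(u0, v0), (u1, v0), (u1, v1), (u0, v1)]
```

When you build the chunk mesh you would store these four UVs next to the four corner vertices of each square, so every square shows only its own part of the big texture.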
Speaking of performance, you will also want to multithread the whole process, so that the game does not freeze whenever a chunk is being generated or updated. It also speeds up world generation in general (which is an ongoing process here, unlike in most games), as well as the update times when chunks are edited.
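The basic pattern looks roughly like this, sketched in Python with the standard `concurrent.futures` module: hand the chunk work to a worker pool, and let the main loop only pick up chunks that are already finished. `ChunkLoader` and `generate_chunk` are hypothetical names, and the "generation" is just a placeholder. Keep in mind that in CPython threads will not actually speed up pure-Python number crunching, so take this as an illustration of the structure, not of the speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def generate_chunk(cx, cz):
    """Placeholder for the expensive part (noise sampling, mesh building)."""
    return [[(cx * 16 + x + cz * 16 + z) % 2 for z in range(16)]
            for x in range(16)]

class ChunkLoader:
    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}   # (cx, cz) -> Future, still being generated
        self.loaded = {}     # (cx, cz) -> finished chunk data

    def request(self, cx, cz):
        """Queue a chunk for background generation (no-op if already known)."""
        key = (cx, cz)
        if key not in self._pending and key not in self.loaded:
            self._pending[key] = self._pool.submit(generate_chunk, cx, cz)

    def poll(self):
        """Call once per frame from the main thread; never blocks."""
        for key, future in list(self._pending.items()):
            if future.done():
                self.loaded[key] = future.result()
                del self._pending[key]

    def shutdown(self):
        """Wait for all outstanding work, e.g. when the game exits."""
        self._pool.shutdown(wait=True)
```

The main loop calls `request` for the chunks around the player and `poll` every frame. Chunks simply show up a little later instead of stalling the frame they were requested in.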
Procedural / voxel-based worlds are probably the single hardest way to create game worlds, so I wouldn't recommend them to beginners. What looks very simplistic is actually a very complex topic.
That said, if you have some background in programming, or are absolutely certain you have enough dedication (weeks to months of it) to go from zero to hero on this topic, then you can always try. It was actually one of the first things I worked on in game development as well. But I already had a background in computer science, it was still far from easy, and I'm still working on it.