Just calculate whether it will cost you more to pay for another external hard disk, or to pay for months of additional electricity (in both time and money)...
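That break-even can be roughed out with a few lines of arithmetic. A minimal sketch, where the wattage, electricity price, and disk price are placeholder assumptions, not measurements:

```python
# Rough break-even sketch: buy another disk, or spend electricity (and
# time) transcoding to HEVC to save space? All prices below are
# assumptions -- substitute your own local numbers.

def transcode_electricity_cost(hours, watts=150.0, price_per_kwh=0.30):
    """Electricity cost of running an encode for `hours` at `watts` draw."""
    return hours * (watts / 1000.0) * price_per_kwh

def disk_cost_per_tb(disk_price=60.0, disk_tb=2.0):
    """Storage price per terabyte of a new external drive."""
    return disk_price / disk_tb

# Example: 300 hours of encoding to free up 1 TB of space.
energy = transcode_electricity_cost(300)   # 13.50 at the assumed rates
storage = disk_cost_per_tb()               # 30.00 per TB
cheaper = "transcode" if energy < storage else "buy a disk"
```

At these made-up numbers transcoding wins, but a slower encoder or cheaper disks flip the result quickly.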
Of course, I know about that trade-off. Which is why I'm interested to know if x265 can become faster - if it can, then the electricity bills (and time spent) can come down a lot.
As of now, it is better to invest in another hard disk, than to transcode into x265. That is why I am saying that for x265 to be a useful product, it should give more "bang for the buck" - ie, better value than the time and energy required to use it.
If it is cheaper to buy a new hard disk than to use x265, then what is the point of making x265? -
To create an encoder which is more efficient regarding compression, for cases where a limited size or bitrate matters more than the encoding effort. UHD-TV is one known target; the UHD successor of Blu-ray will be another. But web video can be a useful target too. Basically everything where you can't simply use more bitrate, but still want a convenient or better ratio between resolution and quality.
Commercial providers of HEVC videos will invest in several computers with modern powerful CPUs to be able to convert several videos in parallel. How many could you afford, as a private person? ... Developers will put effort into making the encoder efficient, but still, quality results will take their time.
By the way, x264 will possibly utilize a multi-core PC with a high number of cores better than x265, because HEVC encoding has a lot more dependencies limiting the parallelization. But you may still try to run several instances of x265 in parallel to utilize many cores better.
Last edited by LigH.de; 20th May 2015 at 08:46.
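Running several independent encodes in parallel is easy to script. A minimal sketch, assuming an `x265` binary on the PATH; the input and output file names are placeholders:

```python
# Run several independent x265 instances in parallel so that many cores
# stay busy even where a single instance can't scale. Assumes `x265` is
# on the PATH; file names below are placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def build_cmd(src, dst):
    # One self-contained encode job (each instance also threads internally).
    return ["x265", "--input", src, "--crf", "24", "-o", dst]

def encode_all(jobs, max_parallel=2):
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(subprocess.run, build_cmd(s, d), check=True)
                   for s, d in jobs]
        for f in futures:
            f.result()  # re-raise if any encode failed

# encode_all([("ep1.y4m", "ep1.hevc"), ("ep2.y4m", "ep2.hevc")])
```

Keep `max_parallel` low enough that the instances don't fight over memory bandwidth.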
-
This is such a loaded question but here goes:
Most of the tests that supposedly prove that x264 was better than hardware encoders were or are flawed for a variety of reasons, including the use of previously compressed sources encoded with one of the encoders being tested, and hardware encoders that couldn't be fully exploited because the software driving them was poorly written and feature-incomplete.
Haswell's QuickSync, with all the settings maxed out, is capable of matching x264 with the fast preset; Nvidia's NVENC can match x264 with the fast preset (I'm talking quality-wise), and NVENC can match x265 so long as we're not talking about that RDO setting being used. It's anyone's guess what the test results would be if someone actually wrote an encoder that used all of NVENC's features.
Hardware encoders have been used in broadcasting for years, decades really, they just haven't been available to the general public. What has been available, such as CUDA and OCL based encoders have been so poorly written and so feature incomplete that they have turned off the general public and led them to believe that hardware encoders are somehow incapable of high quality.
The biggest drawback to hardware encoders is that they rely on some independent developer writing the code to exploit them, and they tie you to a specific hardware configuration, whereas with a pure software encoder you're free to choose whatever hardware config you desire. For many people that's a big selling point, along with the fact that support for a pure software encoder is going to exist within a wide variety of apps. -
There is only one x265 code base. It's available to all under the GPL v2, and it's the same code base available under a commercial license. There are no private forks.
MCW reps have publicly stated that the non-public code base, the one which is funded by corporate sponsorships and licensing agreements, gets priority with regard to development, bug fixes and new features and at some point down the road some of the improvements may be pushed into the public branch.
We have other applications which can utilize one or more instances of x265, and add additional functionality to x265 (for example, video processing prior to encoding), but none of our other software applications or libraries contain any functionality derived from x265 or x264.
I personally wouldn't hold my breath waiting. MCW claims to have an OCL-accelerated variant that has been used in a broadcast environment, such as the Olympics, but they also state that this variant is only available via a licensing agreement, and it certainly seems to be a different code base from the OCL patch that's publicly available for x264.
Last edited by x265; 21st May 2015 at 00:12.
-
You shouldn't expect any big, sudden jumps in speed, but there are still plenty of good ideas and plans on our development roadmap that will continue to improve speed from a pure CPU software perspective. There are a variety of other acceleration possibilities in the future.
-
Great, that is what I was hoping.
In your previous post, you said that your commitments to commercial sponsors will take precedence over requests from individuals. So could you tell me whether speeding up CPU-based encoding is a priority as of now? Or is that something you will work on only after other features have been implemented or refined? -
As I think everyone can see, we listen to our users and we care what they think. But when our developers are working on contractually committed improvements, it's not always possible to pull them off those tasks to implement a user-requested feature if it means that we won't deliver on our contractual commitments on time. Fortunately, the things that the companies sponsoring our efforts care about are generally things that you care about - faster performance and higher compression efficiency (higher visual quality at any bit rate). Hopefully you've noticed that we have found ways to squeeze in features that are needed by the open source community along the way.
-
Here's a question I have been pondering: if you guys started from scratch, could you code an HEVC encoder that ran purely on a GPU? In other words, not using any algorithms, code or functions from x264 or x265, if you developers sat in front of a computer and decided to code an encoder that ran entirely on a GPU, via CUDA, OCL or DX12, could you do it?
I understand that certain parts, such as entropy coding, are purely serial, but other parts such as ME are easily run in parallel. Wouldn't having the entire encoder run on a GPU offset any bottlenecks from running the serial portions on the GPU? Wouldn't you also eliminate the need to copy data from main memory to VRAM, since modern GPUs are capable of writing directly to VRAM without buffering through main memory first? Couldn't you treat a GPU like a 1024-bit wide SIMD unit and run the parts that are SSE/AVX2-accelerated even faster? -
Okay,...
I. GPUs have far more processor cores than CPUs, but each GPU core runs significantly slower than a CPU core.
-> Running a task on a GPU which can only be solved using one core is a real slowdown.
II. The instruction set offered by a GPU is significantly limited (otherwise it wouldn't be a GPU, but a CPU).
-> So to do something the instruction set doesn't directly offer, you have to implement it with the instructions you have; the general movement to extend this capability is normally called GPGPU (see: general-purpose computing on graphics processing units).
One of the main problems: sometimes parts of what you want to do can only be done in a serialized way (they can't be split into multiple parts for solving).
Yes, this is the problem I mentioned in I. So if you emulate/replace a CPU instruction with multiple GPU instructions, and even one of them can only run on a single core (while the others have to wait for its result), the whole thing might be a lot slower than doing it on a CPU.
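This cap from an unavoidable serial portion is exactly what Amdahl's law quantifies. A small illustration (generic, not specific to x265 or any particular GPU):

```python
# Amdahl's law: if a fraction p of the work is parallelizable over n
# cores, the best possible overall speedup is 1 / ((1 - p) + p / n).
def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# Even with 1000 GPU cores, a 10% serial portion caps the speedup
# below 10x:
cap = amdahl_speedup(0.9, 1000)   # ~9.91
```

The serial fraction, not the core count, ends up dominating once n is large.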
III. Communication between different parts of a computer is often really slow, for example sending information from CPU memory to GPU memory.
IV. Porting ideas and algorithms to different architectures isn't easy.
When looking at specifications, for example of video formats, you normally only have a description of:
a. what the format of the output should look like
b. how some of the main ideas are meant to be implemented
The nice thing is, you often get a reference implementation, which is a proof-of-concept implementation. This means it's not meant to be fast, but rather to be 'more easily' understandable, so the way things are implemented there is often not really meant for real-world usage.
For example, if the task was to add the numbers from 1 to 100, a reference implementation might really add up the numbers 1 to 100, whereas the same could be achieved by calculating 100*(100+1)/2.
-> Depending on the environment you write code for, a lot of thinking has to be done about how things could be done faster. Depending on how much you can do in parallel, and how fast it is to write data into memory and get the result back out, the 100*(100+1)/2 way might not even be the fastest; there might be other ways to get the result which are faster. So porting stuff to the GPU really can be a pain and sometimes requires some out-of-the-box thinking.
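The 1-to-100 example above, written out both ways:

```python
# Reference-style loop vs. closed form for summing 1..n.
def sum_naive(n):
    total = 0
    for i in range(1, n + 1):   # literally add each number, reference-style
        total += i
    return total

def sum_closed_form(n):
    return n * (n + 1) // 2     # Gauss formula: one multiply, one divide

assert sum_naive(100) == sum_closed_form(100) == 5050
```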
Side note:
There is also the NC vs. P problem,...
It is unknown whether NC = P, but most researchers suspect this to be false, meaning that there are probably some tractable problems that are "inherently sequential" and cannot significantly be sped up by using parallelism.
So if NC and P are really not equal, you will always have some bottlenecks,...
--------------------
I hope this helps a bit to understand that even if writing an encoder completely for the GPU is possible (which I think it theoretically is), chances are it would be a bad idea to do so, since:
a. it would be a lot slower (than a CPU+GPU-combined solution or a CPU-only solution)
b. it would require a lot of thinking (this really slows things down)
c. it would require implementing/coding a lot of stuff
If it isn't helpful, please simply ignore this post; it was meant to shed some light on the whole 'GPUs are always faster! Why isn't X done completely on the GPU?' topic.
Cu Selur
Ps.: From my 'gut' feeling I would say that, looking at what modern GPU instruction sets offer, implementing an HEVC encoder purely on a GPU would be a fool's task. I suspect some parts of the code could be ported to the GPU and really speed things up, but I'm not sure, due to the points I mentioned before.
I also suspect that even if you had, let's say, 10-20 highly capable fools, writing an HEVC encoder based only on the GPU might take longer than people really care about the format.
PPs.: Also note that writing a GPU-based (or GPU-supported) encoder and creating a hardware encoder chip are two totally different things.
Last edited by Selur; 21st May 2015 at 23:42.
-
GPU kernels need to be set up and launched from the CPU. CUDA and OpenCL 2.0 allow for dynamic parallelism, where GPU kernels can create and run other GPU kernels, but I still think the tasks have to be created and scattered by the CPU, and the results gathered back by the CPU. But let's say we ported everything possible to the GPU... it would run really, really slowly.
HEVC video encoding is a complex algorithm with many small units of work and many inter-dependencies between tasks. It's a task scheduling beast, with a lookahead followed by multiple frame encoders that control multiple row encoders, which control multiple CTU encoders, which call multiple CU encoders... all governed by multiple rate control algorithms and quality algorithms like AQ and CU-tree. Bottlenecks occur when one task takes too long, holding up many other dependent tasks (a frame encoder stalls because a row encoder is stalled because a CU was particularly difficult to encode, stalling all of the other dependent tasks that you've parallelized... they may be moved to the GPU, but they're idle, waiting for the slowest task to finish).
Even though you can parallelize many tasks (for example, parallelizing the motion estimation and mode decision when encoding a single CU with our PME and PMODE functions), it isn't always faster, as you might have made an early decision on the CPU to use a fast mode (merge, skip). CPUs have MUCH higher single-threaded performance, and x265 loves fast single-threaded performance.
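That stalling effect can be made concrete with a toy dependency graph: with unlimited workers, wall time equals the longest chain of dependent tasks (the critical path), not total work divided by worker count. A generic illustration, not x265's actual scheduler:

```python
# Critical-path length of a task DAG: with unlimited parallelism the
# makespan equals the longest chain of dependent tasks.
def critical_path(durations, deps):
    """durations: {task: time}; deps: {task: [prerequisite tasks]}."""
    finish = {}
    def finish_time(t):
        if t not in finish:
            start = max((finish_time(d) for d in deps.get(t, [])), default=0)
            finish[t] = start + durations[t]
        return finish[t]
    return max(finish_time(t) for t in durations)

# Four dependent CTUs plus one independent task; the "difficult" CTU c2
# dominates, so the makespan is 8 even though total work is 10.
durations = {"c1": 1, "c2": 5, "c3": 1, "c4": 1, "c5": 2}
deps = {"c2": ["c1"], "c3": ["c2"], "c4": ["c3"]}
makespan = critical_path(durations, deps)   # 8
```

Adding more parallel workers (or moving idle tasks to a GPU) cannot shorten the c1-c2-c3-c4 chain.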
I'm not saying that GPU acceleration of HEVC isn't possible. Selected functions may benefit from GPU acceleration, on the right platforms, done in the right way. But it isn't an easy problem to solve. -
So does this imply that if I wanted to build a pure x265 encoding PC on a budget, I would be better off using a Pentium G3260 overclocked to 4.2 GHz rather than an FX 6300 clocked at 3.5 GHz?
What about AVX2: if I were to go from an FX 8320 clocked at 3.5 GHz to an i5 4590 clocked at 3.3 GHz that has AVX2 support, would performance be significantly faster?
What about Intel's upcoming AVX-512 instruction set: do you expect to see major benefits, or will only certain small parts of x265 benefit? -
My advice would be to look at x265 benchmarks on the processors you're considering. Anandtech has started using x265 for benchmarking, and it would surprise me if other hardware tech sites don't follow suit, as x265 is a perfect workload for processor benchmarking (it's extremely compute-intensive, you can expect it to be very widely used by both consumers and professionals, and it has extremely high levels of optimization for all available processor instruction sets). We're not done with our AVX2 work yet (getting close on 8-bit, a bit more work on 10-bit), but today we see a ~30% performance boost due to AVX2 assembly code optimization. AVX-512 will be similarly beneficial.
-
I am aware of that. I wasn't implying that you do not implement user requests at all. I just wanted to know if speeding up encoding is a priority for you at this time. Do you have an idea (even a rough estimate) as to when you will be implementing further accelerations?
(I'm trying to figure out whether to wait for x265 to become faster, or simply use x264 to transcode most of the videos I have right now. As I said earlier, I see a definite improvement in quality using x265, at the same bitrates.) -
Now that x265 supports a multilib CLI as well as CLI + alternative DLL, I wonder how to detect a multilib EXE without trial and error (trying to run a conversion with an absent DLL and the other output depth, and testing whether that failed). Should the version information output routine perhaps check the availability of the alternate routines and report accordingly? Maybe like:
Code:
x265 [info]: build info [Windows][GCC 4.9.2][64 bit] 8bit
x265 [info]: alternative output depth(s): 10bit (internal)
Code:
x265 [info]: build info [Windows][GCC 4.9.2][64 bit] 8bit
x265 [info]: alternative output depth(s): 10bit (external)
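If the info lines looked like that proposal, detecting a multilib build would reduce to simple parsing. A sketch against the proposed format (which is only a suggestion in this post, not actual x265 output):

```python
# Parse the *proposed* info lines above to list available output depths.
# This line format is a suggestion from this post, not real x265 output.
import re

def parse_output_depths(info_text):
    depths = []
    m = re.search(r"build info .*\] (\d+)bit", info_text)
    if m:
        depths.append(int(m.group(1)))
    for m in re.finditer(r"alternative output depth\(s\): (\d+)bit", info_text):
        depths.append(int(m.group(1)))
    return depths

sample = ("x265 [info]: build info [Windows][GCC 4.9.2][64 bit] 8bit\n"
          "x265 [info]: alternative output depth(s): 10bit (internal)\n")
depths = parse_output_depths(sample)   # [8, 10]
```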
-
I'm not sure if this is an x265 bug or a HandBrake bug, or whether I'm doing something wrong. When I use average-bitrate 2-pass encoding, the first and second pass run at almost the same speed. This is true whether I don't specify any "fast first pass" option (by default I think it is supposed to do that), or even if I give --no-slow-firstpass as an extra option. When I look at the log, assuming that it is showing the settings for the first pass while the first pass is happening, the encoder is using the same (slowish) settings that I gave for the second pass. I thought that on the first pass, the encoder uses much faster settings (like dia). And I'm pretty sure that some time back, the log for the first pass used to show faster settings than what I had given for the 2nd pass.
How do I make x265 do a fast first pass? Could this be an error in HandBrake? -
Unless
Code:
--slow-firstpass
is used, x265 speeds up the first pass with settings like:
Code:
--me dia --rd 2 --subme 2 --ref 1 --no-amp --analyse none --early-skip
Could this be an error in handbrake?
Since x265 works fine here, my guess is that it at least is no problem with x265, and thus you might want to create a separate thread about your problem. -
Right. And would this be reflected in the log? Because when I check the log during the first pass, these are not the settings shown. This is what it is showing, which is in fact the same settings that I want the second pass to use:
[13:44:40] + preset: medium
[13:44:40] + options: --high-tier:--ref=9:--no-slow-firstpass:--allow-non-conformance:--me=2:--subme=4:--merange=90:--rc-lookahead=90:--b-adapt=2:--bframes=7
[13:44:40] + profile: auto
[13:44:40] + bitrate: 1100 kbps, pass: 1 -
No clue about HandBrake's log, but x265 outputs that it is using rd=2,...
-
I just tried encoding the same file using your "Hybrid", and am getting a fast first pass. The log shows settings for first pass (Dia, RD=2...), and the FPS encoded is also considerably faster in the first pass. So my guess is that it is a handbrake bug. I'll take it up in their forums.
Good program, btw. -
--zones isn't working for me. What am I doing wrong? The quality difference which should be obvious is not appearing in my video.
Code:
avs4x26x.exe --x26x-binary x265 "KK3band.avs" --pass 1 --bitrate 2000 --preset veryslow --allow-non-conformance --zones 0,50,b=0.01 -o "KK3band.hevc"
avs4x26x.exe --x26x-binary x265 "KK3band.avs" --pass 2 --bitrate 2000 --preset veryslow --allow-non-conformance --zones 0,50,b=0.01 -o "KK3band.hevc"
-
Have you tried not modifying the first but the last 50 frames?
Have you tried using something like 0.1 instead of 0.01? (20 kbit/s seems really low and might be too low to be possible, depending on the resolution and frame rate) -
Yes to both questions. After the first attempt failed, I tried lowering the number and doing the first 50 instead of the last. I tried q= as well.
I'm using the latest x265 compiled by LigH.de. The video is 480p and 23.976 fps. -
There seems to be a lot more broken regarding command line parsing; e.g. --log-level full --help does not create a full help anymore, and hasn't for a while.
To get a well-founded response, I asked on the developer mailing list.
Last edited by LigH.de; 10th Jul 2015 at 08:50.
-
There has been a fix patch for the full help output; I hope it will reactivate zones as well. I believe the reason was related to accidentally resetting some options to defaults twice, or similar...
Last edited by LigH.de; 10th Jul 2015 at 16:41.