Uncategorized
Dragonfly: Enhanced Vision-Language Model with Multi-Resolution Zoom Launched by Together.ai | IDOs News
Together.ai has announced the launch of Dragonfly, an innovative vision-language model designed to enhance fine-grained visual understanding and reasoning about image regions. The architecture leverages multi-resolution zoom-and-select capabilities to optimize multi-modal reasoning while maintaining context efficiency, according to Together AI.
Dragonfly Model Architecture
Dragonfly employs two primary strategies: multi-resolution visual encoding and zoom-in patch selection. These techniques enable the model to focus on fine-grained details of image regions, enhancing its commonsense reasoning capabilities. The architecture processes images at multiple resolutions—low, medium, and high—dividing each image into sub-images that are encoded into visual tokens. These tokens are then projected into a language space, forming a concatenated sequence that feeds into the language model.
Zoom-in Patch Selection: Dragonfly employs a selective approach for high-resolution images, identifying and retaining only the sub-images that provide the most significant visual information. This targeted selection reduces redundancy and improves the overall model efficiency.
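The zoom-in patch selection described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows and uses pixel variance as a stand-in importance score. The article does not say how Dragonfly ranks sub-images, so `patch_score`, `select_patches`, and the variance criterion are hypothetical and only show the "split, score, keep the top few" pattern.

```python
from statistics import pvariance


def split_into_patches(image, patch_size):
    """Split a 2D grayscale image (list of rows) into square sub-images."""
    patches = []
    for top in range(0, len(image) - patch_size + 1, patch_size):
        for left in range(0, len(image[0]) - patch_size + 1, patch_size):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(((top, left), patch))
    return patches


def patch_score(patch):
    """Stand-in importance score (an assumption): pixel variance of the patch."""
    return pvariance([pixel for row in patch for pixel in row])


def select_patches(image, patch_size, keep):
    """Return the (top, left) positions of the `keep` highest-scoring patches."""
    scored = [(patch_score(patch), position)
              for position, patch in split_into_patches(image, patch_size)]
    scored.sort(key=lambda item: item[0], reverse=True)
    return sorted(position for _, position in scored[:keep])


# Toy 4x4 "high-resolution" image: only the bottom-right quadrant has detail.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 9],
    [0, 0, 5, 3],
]
selected = select_patches(image, patch_size=2, keep=1)
print(selected)
```

In a real model the retained sub-images would be encoded into visual tokens and concatenated with the lower-resolution tokens. Here the selection step alone is shown, since dropping flat patches is what keeps the token sequence short.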
Performance and Evaluation
Dragonfly demonstrates promising performance on several vision-language benchmarks, including commonsense visual question answering and image captioning. The model achieved competitive results on benchmarks such as AI2D, ScienceQA, MMMU, MMVet, and POPE, showcasing its effectiveness in fine-grained understanding of image regions.
Benchmark Performance:
Model | AI2D | ScienceQA | MMMU | MMVet | POPE |
---|---|---|---|---|---|
VILA | – | 68.2 | – | 34.9 | 85.5 |
LLaVA-v1.5 (Vicuna-7B) | 54.8 | 70.4 | 35.3 | 30.5 | 85.9 |
LLaVA-v1.6 (Mistral-7B) | 60.8 | 72.8 | 33.4 | 44.8 | 86.7 |
QWEN-VL-chat | 52.3 | 68.2 | 35.9 | – | – |
Dragonfly (LLaMA-8B) | 63.6 | 80.5 | 37.8 | 35.9 | 91.2 |
Dragonfly-Med
In collaboration with Stanford Medicine, Together.ai has also introduced Dragonfly-Med, a version fine-tuned on 1.4 million biomedical image-instruction samples. This model excels in high-resolution medical data tasks, outperforming previous models like Med-Gemini on multiple medical imaging benchmarks.
Evaluation on Medical Benchmarks
Dragonfly-Med was evaluated on visual question-answering and clinical report generation tasks, achieving state-of-the-art results on several benchmarks:
Dataset | Metric | Med-Gemini | Dragonfly-Med (LLaMA-8B) |
---|---|---|---|
VQA-RAD | Acc (closed) | 69.7 | 77.4 |
SLAKE | Acc (closed) | 84.8 | 90.4 |
Path-VQA | Acc (closed) | 83.3 | 92.3 |
Conclusion and Future Work
Dragonfly’s architecture offers a new research direction by focusing on zooming in on image regions to capture more fine-grained visual information. Together.ai plans to continue improving the model’s capabilities and exploring new architectures and visual encoding strategies to benefit broader scientific fields.
The collaboration with Stanford Medicine and the utilization of resources like Meta LLaMA3 and CLIP from OpenAI have been crucial in developing Dragonfly. The model’s codebase also builds upon the foundations of Otter and LLaVA-UHD.
Image source: Shutterstock
Advantages of Mobile Apps in Gambling: The Example of Pin Up App | IDOs News
By Terry Ashton, updated August 31, 2024
Online gambling is going mobile: over 50% of players already play casino games on their mobile devices, and that share is expected to grow. But does a mobile app have real advantages over browser-based play? To find out, we tried gambling on a desktop browser, a mobile browser, and the app. That allowed us to identify the key benefits and drawbacks of casino mobile applications. If you’re considering using one, keep reading for some helpful insights.
Benefits of Mobile Play at Pin Up Casino
Mobile gambling is on the rise for several reasons, including the following:
- Ultimate accessibility. You can access the app anywhere, even on the go. You don’t need to take additional actions — the casino opens with just one click.
- Lower Internet requirements, offline play. If you play for fun, you can do it even without an Internet connection. If you prefer to play for real money, the connection requirements will still be much lower because most data is already downloaded to your device.
- Push notifications. You can immediately learn about the new top promotions and the hottest games without checking your email.
- Special bonuses. Sometimes, special bonuses are granted to mobile players. Some casinos may add them occasionally to encourage players to play on apps.
- The same game selection. If a casino is modern and cooperates with top providers, all games will be compatible with mobile devices. For instance, if you play at Pin Up casino online, you can access the same collection of games. That goes not only for slots but also for live games, table games, etc.
- Higher security standards. The app is protected even better than the site. Data is encrypted, and the chance that anyone will access your account is close to zero.
Registration also goes smoothly. Once you sign up on the browser or app, you can access the platform with just one click by entering your Pin Up login and password.
Considering the Cons: Potential Drawbacks of Using a Pin Up Mobile App
Nothing is perfect, and neither are casino apps. Gamblers should also consider the drawbacks, and the most common ones are as follows:
- Installing software is a must. You need to install the software on your phone. It’s safe if you download it from the official casino site and the product is reputable. However, clicking on the wrong link and downloading the wrong APK file may result in problems.
- Battery drain and storage space. It’s no secret that charging the phone all the time is annoying, and innovative slots with top graphics may drain your battery quickly. Also, though most apps don’t take much space (in the case of Pin Up, it’s just about 100 MB), they still take some effort to manage.
- Compatibility requirements. Any app will have technical requirements, and most aren’t compatible with old mobile devices and tablets. Also, you’ll need to install updates quite regularly.
- Smaller screen. This is a disadvantage for those who prefer playing on larger screens, particularly those who prefer live dealer games.
Do the pros outweigh the cons for you? If yes, the mobile app will boost your experience. If not, browser play may be a better option.
Final Thoughts: The App vs. Browser Play at Pin Up Casino
Technology is shaping the industry. Nowadays, there’s no significant difference between playing on a mobile app and playing in a mobile or desktop browser. You get the same game selection, the same bonuses, and the same smooth experience. So, it’s a matter of taste. Choose what will work best for you and enjoy your play.
NVIDIA Introduces Fast Inversion Technique for Real-Time Image Editing | IDOs News
NVIDIA has unveiled an innovative method called Regularized Newton-Raphson Inversion (RNRI) aimed at enhancing real-time image editing capabilities based on text prompts. This breakthrough, highlighted on the NVIDIA Technical Blog, promises to balance speed and accuracy, making it a significant advancement in the field of text-to-image diffusion models.
Understanding Text-to-Image Diffusion Models
Text-to-image diffusion models generate high-fidelity images from user-provided text prompts by mapping random samples from a high-dimensional space. These models undergo a series of denoising steps to create a representation of the corresponding image. The technology has applications beyond simple image generation, including personalized concept depiction and semantic data augmentation.
The Role of Inversion in Image Editing
Inversion involves finding a noise seed that, when processed through the denoising steps, reconstructs the original image. This process is crucial for tasks like making local changes to an image based on a text prompt while keeping other parts unchanged. Traditional inversion methods often struggle with balancing computational efficiency and accuracy.
Introducing Regularized Newton-Raphson Inversion (RNRI)
RNRI is a novel inversion technique that outperforms existing methods by offering rapid convergence, superior accuracy, reduced execution time, and improved memory efficiency. It achieves this by solving an implicit equation using the Newton-Raphson iterative method, enhanced with a regularization term to ensure the solutions are well-distributed and accurate.
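The idea of solving an implicit equation with Newton-Raphson plus a regularization term can be shown on a one-dimensional toy problem. This is a sketch under stated assumptions, not NVIDIA's implementation: the `tanh` term stands in for a denoiser evaluated at the unknown, and `reg_weight * z` is a hypothetical penalty that pulls the solution toward zero. The article does not describe the form of RNRI's actual regularizer.

```python
import math


def newton_invert(x, step_scale=0.5, reg_weight=0.0, iterations=50, tol=1e-12):
    """Solve z - x - step_scale*tanh(z) + reg_weight*z = 0 by Newton-Raphson.

    The tanh term is a toy stand-in for a denoiser evaluated at the unknown z,
    and reg_weight*z is a hypothetical penalty pulling z toward zero.
    """
    z = x  # initial guess: start from the observed value
    for _ in range(iterations):
        value = z - x - step_scale * math.tanh(z) + reg_weight * z
        slope = 1.0 - step_scale * (1.0 - math.tanh(z) ** 2) + reg_weight
        update = value / slope  # Newton-Raphson step: f(z) / f'(z)
        z -= update
        if abs(update) < tol:
            break
    return z


z_plain = newton_invert(1.0)
z_regularized = newton_invert(1.0, reg_weight=0.1)
# Residual of the unregularized implicit equation at the plain solution.
residual = z_plain - 1.0 - 0.5 * math.tanh(z_plain)
print(z_plain, z_regularized, residual)
```

With `reg_weight=0.0` the loop is plain Newton-Raphson on the implicit equation. A positive weight trades a small residual for a solution closer to zero, which is the general role a regularizer plays in this kind of solver.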
Comparative Performance
Figure 2 on the NVIDIA Technical Blog compares the quality of reconstructed images using different inversion methods. RNRI shows significant improvements in PSNR (Peak Signal-to-Noise Ratio) and run time over recent methods, tested on a single NVIDIA A100 GPU. The method excels in maintaining image fidelity while adhering closely to the text prompt.
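For readers unfamiliar with the metric, PSNR is computed from the mean squared error between the original and the reconstructed image as 10 · log10(MAX² / MSE), where MAX is the largest possible pixel value. The sketch below uses flat lists of pixel values and assumes 8-bit images (MAX = 255). The function name `psnr` is illustrative and is not taken from the NVIDIA post.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel lists."""
    squared_errors = [(a - b) ** 2 for a, b in zip(original, reconstructed)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images: no reconstruction error
    return 10.0 * math.log10(max_value ** 2 / mse)


identical = psnr([0, 255], [0, 255])
worst_case = psnr([0, 0], [255, 255])
small_error = psnr([100], [110])
print(identical, worst_case, small_error)
```

Higher values mean the reconstruction is closer to the original, so a better inversion method should raise PSNR.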
Real-World Applications and Evaluation
RNRI has been evaluated on 100 MS-COCO images, showing superior performance in both CLIP-based scores (for text prompt compliance) and LPIPS scores (for structure preservation). Figure 3 demonstrates RNRI’s capability to edit images naturally while preserving their original structure, outperforming other state-of-the-art methods.
Conclusion
The introduction of RNRI marks a significant advancement in text-to-image diffusion models, enabling real-time image editing with unprecedented accuracy and efficiency. This method holds promise for a wide range of applications, from semantic data augmentation to generating rare-concept images.
For more detailed information, visit the NVIDIA Technical Blog.
Image source: Shutterstock
AMD Radeon PRO GPUs and ROCm Software Expand LLM Inference Capabilities | IDOs News
AMD has announced advancements in its Radeon PRO GPUs and ROCm software, enabling small enterprises to leverage Large Language Models (LLMs) like Meta’s Llama 2 and 3, including the newly released Llama 3.1, according to AMD.com.
New Capabilities for Small Enterprises
With dedicated AI accelerators and substantial on-board memory, AMD’s Radeon PRO W7900 Dual Slot GPU offers market-leading performance per dollar, making it feasible for small firms to run custom AI tools locally. This includes applications such as chatbots, technical documentation retrieval, and personalized sales pitches. The specialized Code Llama models further enable programmers to generate and optimize code for new digital products.
The latest release of AMD’s open software stack, ROCm 6.1.3, supports running AI tools on multiple Radeon PRO GPUs. This enhancement allows small and medium-sized enterprises (SMEs) to handle larger and more complex LLMs, supporting more users simultaneously.
Expanding Use Cases for LLMs
While AI techniques are already prevalent in data analysis, computer vision, and generative design, the potential use cases for AI extend far beyond these areas. Specialized LLMs like Meta’s Code Llama enable app developers and web designers to generate working code from simple text prompts or debug existing code bases. The parent model, Llama, offers extensive applications in customer service, information retrieval, and product personalization.
Small enterprises can utilize retrieval-augmented generation (RAG) to make AI models aware of their internal data, such as product documentation or customer records. This customization results in more accurate AI-generated outputs with less need for manual editing.
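The retrieval step of RAG can be sketched in a few lines. This is a deliberately simplified illustration: it ranks documents by shared words rather than by embedding similarity, and it only builds the prompt without calling any model. The function names and sample documents are hypothetical.

```python
def tokenize(text):
    """Lowercase a text and return its set of whitespace-separated words."""
    return set(text.lower().split())


def retrieve(query, documents, top_k=1):
    """Rank documents by word overlap with the query (a stand-in for embeddings)."""
    query_words = tokenize(query)
    ranked = sorted(documents,
                    key=lambda doc: len(query_words & tokenize(doc)),
                    reverse=True)
    return ranked[:top_k]


def build_prompt(query, documents, top_k=1):
    """Prepend the retrieved internal documents to the user question."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical internal documents.
documents = [
    "The W-100 pump ships with a two year warranty",
    "Office hours are nine to five",
]
prompt = build_prompt("what warranty does the pump have", documents)
print(prompt)
```

The resulting prompt would then be passed to a locally hosted LLM, which answers using the retrieved context instead of relying only on its training data.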
Local Hosting Benefits
Despite the availability of cloud-based AI services, local hosting of LLMs offers significant advantages:
- Data Security: Running AI models locally eliminates the need to upload sensitive data to the cloud, addressing major concerns about data sharing.
- Lower Latency: Local hosting reduces lag, providing instant feedback in applications like chatbots and real-time support.
- Control Over Tasks: Local deployment allows technical staff to troubleshoot and update AI tools without relying on remote service providers.
- Sandbox Environment: Local workstations can serve as sandbox environments for prototyping and testing new AI tools before full-scale deployment.
AMD’s AI Performance
For SMEs, hosting custom AI tools need not be complex or expensive. Applications like LM Studio facilitate running LLMs on standard Windows laptops and desktop systems. LM Studio is optimized to run on AMD GPUs via the HIP runtime API, leveraging the dedicated AI Accelerators in current AMD graphics cards to boost performance.
Professional GPUs like the 32GB Radeon PRO W7800 and 48GB Radeon PRO W7900 offer sufficient memory to run larger models, such as the 30-billion-parameter Llama-2-30B-Q8. ROCm 6.1.3 introduces support for multiple Radeon PRO GPUs, enabling enterprises to deploy systems with multiple GPUs to serve requests from numerous users simultaneously.
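A rough way to see why these memory sizes matter is to estimate the space taken by the weights alone: parameter count times bits per weight, divided by eight. The sketch below is a back-of-the-envelope calculation, not an AMD sizing tool. It ignores the KV cache, activations, and runtime overhead, and treats 1 GB as 10^9 bytes.

```python
def weight_memory_gb(parameter_count, bits_per_weight):
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return parameter_count * bits_per_weight / 8 / 1e9


def fits_in_vram(parameter_count, bits_per_weight, vram_gb):
    """True if the weights alone fit; real deployments need extra headroom."""
    return weight_memory_gb(parameter_count, bits_per_weight) <= vram_gb


q8_size = weight_memory_gb(30e9, 8)     # 30B parameters at 8 bits each
fp16_size = weight_memory_gb(30e9, 16)  # same model at 16 bits each
print(q8_size, fp16_size, fits_in_vram(30e9, 8, 48))
```

Under these assumptions, a 30-billion-parameter model quantized to 8 bits needs about 30 GB for its weights, which leaves working headroom on a 48GB card, while the same model at 16 bits would not fit on a single card.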
Performance tests with Llama 2 indicate that the Radeon PRO W7900 offers up to 38% higher performance-per-dollar compared to NVIDIA’s RTX 6000 Ada Generation, making it a cost-effective solution for SMEs.
With the evolving capabilities of AMD’s hardware and software, even small enterprises can now deploy and customize LLMs to enhance various business and coding tasks, avoiding the need to upload sensitive data to the cloud.
Image source: Shutterstock